DocReader

Indranil Chandra

Mumbai, Maharashtra


DocReader is designed to search the local file system for the document most contextually relevant to a keyword spoken by the user, using ML/DL techniques, and then convert that document into an audio file that the voice assistant can play. It is intended to make documents on the local file system more accessible and readable for a user who is visually impaired or whose hands are occupied with another activity and not free to operate the PC.

Project status: Concept

Internet of Things, Artificial Intelligence, PC Skills

Groups
Early Innovation for PC Skills

Intel Technologies
AI DevCloud / Xeon, Intel Python

Overview / Usage

SKILL -> Search the local file system for the document most contextually relevant to a keyword spoken by the user, using ML/DL techniques (Intent Extraction and Topic Modelling), and then convert that document into an audio file that the voice assistant can play.

PURPOSE -> Make documents on the local file system more accessible and readable for a user who is visually impaired or whose hands are occupied with another activity and not free to operate the PC.
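
The search step of the skill can be illustrated with a small sketch. The project plans to use Intent Extraction and Topic Modelling. The example below swaps in a simpler technique, TF-IDF with cosine similarity, purely to show what "most contextually relevant" could mean in code. The function names, the `documents` mapping, and the sample file names are all hypothetical and are not part of the project.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_documents(query, documents, top_k=3):
    """Rank documents by TF-IDF cosine similarity to the query.

    `documents` maps a file name to its text content (hypothetical input;
    a real client would read these from the local file system).
    Returns up to `top_k` (file name, score) pairs, best match first.
    """
    doc_tokens = {name: tokenize(text) for name, text in documents.items()}
    n_docs = len(doc_tokens)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in doc_tokens.values():
        doc_freq.update(set(tokens))

    def tfidf(tokens):
        # Smoothed IDF keeps the weight positive even for very common terms.
        counts = Counter(tokens)
        return {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }

    def cosine(a, b):
        dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query_vec = tfidf(tokenize(query))
    scored = [(name, cosine(query_vec, tfidf(tokens)))
              for name, tokens in doc_tokens.items()]
    # Drop documents that share no term with the query.
    scored = [(name, score) for name, score in scored if score > 0.0]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    sample = {
        "budget.txt": "quarterly budget and finance report",
        "recipe.txt": "chocolate cake recipe with butter",
        "notes.txt": "meeting notes about the budget review",
    }
    for name, score in rank_documents("budget report", sample):
        print(f"{name}: {score:.3f}")
```

A topic model would replace the TF-IDF vectors with topic distributions, but the overall shape (vectorize the query, vectorize each file, sort by similarity) would likely stay the same.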

Methodology / Approach

  • The user will ask the Voice Assistant directly to search their PC for a keyword, topic, or file name.
  • The keyword will be passed to a Cloud Server (AWS/Microsoft Azure).
  • Intent 1, "Keyword Search", will be parsed from the Voice Assistant request and interpreted at the Cloud Server (AWS/Microsoft Azure).
  • The intent will be passed from the Cloud Server to the client-side program, written in C#.
  • The client-side program will call an internal API hosted on the same PC, which will return the list of files in the local file system that are most contextually relevant to the searched keyword.
  • The internal API will be the endpoint of a Python program, hosted using Flask, that performs Intent Classification and Topic Modelling using Machine Learning and Deep Learning techniques.
  • The result returned from the internal API will be passed back to the Voice Assistant through the client-side program and the Cloud Server.
  • If several files match, the user will select one of them and then ask the Voice Assistant to read out its contents.
  • The file name will be passed to a Cloud Server (AWS/Microsoft Azure).
  • Intent 2, "Play File", will be parsed from the Voice Assistant request and interpreted at the Cloud Server (AWS/Microsoft Azure).
  • Intent 2 (with the file name) will be passed from the Cloud Server to the client-side program, written in C#.
  • The client-side program will call another internal API hosted on the same PC, which will run the Text-to-Speech conversion engine.
  • The result returned from the internally hosted Text-to-Speech conversion API (an offline version of the Microsoft Speech Service API) will be encrypted by the client-side program and passed to the Voice Assistant through the Cloud Server.
  • This approach is also intended to protect data security: file contents are read only by the client-side program and never leave the local PC file system in unencrypted form.
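
The two-intent flow above can be sketched as a small dispatcher on the client side. This is only an illustration under stated assumptions. It is written in Python for brevity, although the client is planned in C#. The intent identifiers `KeywordSearch` and `PlayFile`, the function names, and the in-memory `index` are hypothetical. The search and text-to-speech functions are simple stand-ins for the two internal APIs. The encryption step is left out because the scheme is not specified.

```python
def search_files(keyword, index):
    """Hypothetical stand-in for the internal search API.

    Uses a plain substring match over a {file name: text} index; the
    planned API would use intent classification and topic modelling.
    """
    keyword = keyword.lower()
    return sorted(
        name for name, text in index.items()
        if keyword in name.lower() or keyword in text.lower()
    )


def synthesize_speech(text):
    """Hypothetical stand-in for the offline text-to-speech engine.

    Returns the UTF-8 bytes of the text as placeholder "audio".
    """
    return text.encode("utf-8")


def handle_intent(intent, slot, index):
    """Route an intent received from the cloud server to a local handler.

    `slot` carries the keyword for a search or the file name for playback.
    """
    if intent == "KeywordSearch":
        return {"intent": intent, "files": search_files(slot, index)}
    if intent == "PlayFile":
        if slot not in index:
            return {"intent": intent, "error": "file not found"}
        return {"intent": intent, "audio": synthesize_speech(index[slot])}
    return {"intent": intent, "error": "unsupported intent"}


if __name__ == "__main__":
    index = {
        "budget.txt": "quarterly budget report",
        "recipe.txt": "chocolate cake",
    }
    print(handle_intent("KeywordSearch", "budget", index))
    print(handle_intent("PlayFile", "recipe.txt", index))
```

Keeping both handlers behind one dispatcher mirrors the description above: only the intent name and a single value cross the cloud boundary, while file contents are touched only inside the local handlers.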

Technologies Used

  1. Amazon Alexa Skills for the PC to allow the user to query any keyword or topic or document name to search for on their PC.
  2. Amazon Web Services.
  3. Microsoft Speech Service API.
  4. Intel NLP Architect to analyze the Natural Language statements to perform intent extraction and topic modelling.
  5. Intel Optimized Python.
  6. Intel Optimized TensorFlow.
  7. Intel AI DevCloud.
  8. C# .NET library.