02 - AccessAI

Rainer Talvar
Team lead

What problem are you solving with the idea?

Our web extension eases access to digital accessibility features for users who face barriers when navigating the web due to mental or physical impairments. Many web pages contain complex or foreign-language text, lack text-to-speech aids, or offer no support for simplification. This can exclude people from the internet, a critical platform for economic, social, cultural, political, and civic participation in today's digital age, leaving this demographic at a significant disadvantage.

By leveraging the LLaMa 3.1 model and IBM Cloud services, we provide real-time web accessibility enhancements: simplifying complex text and grammar, translating content into users' native languages, and offering text-to-speech for individuals with reading difficulties. Additionally, the extension incorporates a chatbot specialising in advanced text manipulation such as summarising, explaining, highlighting key points, and rephrasing.

This solution bridges the gap for people facing these barriers, adding a layer of social inclusion to the web that also aligns with the current WCAG (Web Content Accessibility Guidelines), ultimately leading to a better user experience.

What is your solution?

Our main goal for the hackathon is to develop an MVP (Minimum Viable Product) that provides access to all the aforementioned features.


The front end of the extension will use JavaScript (JS) for background scripts that handle tasks such as content extraction, browser API interaction, and communication with backend services. We will also use JS to hook directly into the DOM (Document Object Model) of the page the user is browsing, capturing the relevant content to send to the backend for the LLM to process. A simple UI for the chatbot and the overall application will be built with HTML and CSS.
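Before the captured page content reaches the LLM, it still has to be reduced from raw markup to readable text. A minimal server-side sketch of that step using Python's standard-library `html.parser` (the class and function names are illustrative, not part of the project spec):

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text from HTML, skipping script/style content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text outside skipped elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    """Return the visible text of an HTML fragment as one string."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)
```

In the extension itself this extraction would live in the JS content script; a Python fallback like this lets the backend sanitise whatever it receives.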

On the backend, we plan to use IBM Cloud to run the LLaMa 3.1 model alongside other IBM services that will enhance our product. The LLM's main purpose is to handle requests such as text-to-speech, advanced text manipulation, and translation. The backend will be built with the FastAPI framework in Python to ensure efficiency and seamless performance. A PostgreSQL database accessed through SQLAlchemy will store relational data such as user profiles, settings, and chat logs, while a cloud-based document store will hold LLM-generated responses, as these tend to be more unstructured and dynamic.

The flow of data will look roughly like this:
Content script (JS) → Background script (JS) → IBM Cloud API → LLaMa 3.1 model (fine-tuned with Python for better quality, faster usability, and WCAG compliance) → Output returned to the user
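The backend half of that flow can be expressed as a chain of small functions. Here `llm_complete` is a hypothetical stand-in for the IBM Cloud call to LLaMa 3.1, and the prompt templates are illustrative:

```python
def llm_complete(prompt: str) -> str:
    """Placeholder for the IBM Cloud / LLaMa 3.1 call (not a real API)."""
    return f"[LLM output for: {prompt[:40]}]"


def build_prompt(task: str, text: str, language: str = "en") -> str:
    """Turn an accessibility task into an LLM prompt (templates are examples)."""
    templates = {
        "simplify": "Rewrite the following text in plain {lang}:\n{text}",
        "translate": "Translate the following text into {lang}:\n{text}",
        "summarise": "Summarise the following text in {lang}:\n{text}",
    }
    return templates[task].format(lang=language, text=text)


def process_page(extracted_text: str, task: str, language: str = "en") -> str:
    # Content script → background script → this entry point → LLM → user.
    prompt = build_prompt(task, extracted_text, language)
    return llm_complete(prompt)
```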

To ensure real-time communication between the LLM and the user/website, we will implement a WebSocket API so that features that interact directly with the page, such as text-to-speech, text simplification, and translation, run with as little latency as possible.

The chatbot will be built into the extension on top of the LLaMa 3.1 model, fine-tuned to specialise in user support. The model should interact with the user while taking into account both the past conversation context and the current website.
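One way to carry that context is a per-user session object that holds the current page text plus a bounded window of recent turns. The prompt format and the 8-turn window below are illustrative choices, not part of the project spec:

```python
from collections import deque


class ChatSession:
    """Tracks the current page text and recent conversation turns."""

    def __init__(self, page_text: str, max_turns: int = 8):
        self.page_text = page_text
        # Oldest turns fall off automatically once the window is full.
        self.history = deque(maxlen=max_turns)

    def build_prompt(self, user_message: str) -> str:
        """Combine page context, history, and the new message into one prompt."""
        turns = "\n".join(f"{role}: {text}" for role, text in self.history)
        return (
            f"Current page:\n{self.page_text}\n\n"
            f"Conversation so far:\n{turns}\n\n"
            f"user: {user_message}\nassistant:"
        )

    def record(self, user_message: str, reply: str) -> None:
        """Store a completed exchange for future prompts."""
        self.history.append(("user", user_message))
        self.history.append(("assistant", reply))
```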
