LexiLearn Study Companion
Intelligent study tool that extracts unknown vocabulary and generates audio for learning
Planned EdTech
Problem Solved
Students reading in a second language often hit unknown words and lose the thread of the text. Most digital reading tools offer little context-aware vocabulary support. LexiLearn extracts unfamiliar vocabulary automatically and helps students master it.
Core Features
- ✓ Upload learning materials (PDF, EPUB, TXT)
- ✓ Automatic vocabulary extraction and term frequency analysis
- ✓ Context-aware definitions with usage examples
- ✓ Text-to-speech generation with adjustable speed
- ✓ Spaced repetition review system for vocabulary retention
- ✓ Personal vocabulary flashcards with progress tracking
- ✓ Reading statistics and comprehension insights
- ✓ Export study materials to Anki format
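As an illustration of the Anki export feature listed above, here is a minimal sketch that serializes flashcards to tab-separated text, one of the plain-text layouts Anki's import dialog accepts. The `Flashcard` record, its three fields, and the `AnkiExporter` class are hypothetical names chosen for this example, not the project's actual data model.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical flashcard shape: term, definition, and one usage example. */
record Flashcard(String term, String definition, String example) {}

final class AnkiExporter {

    /** Replaces tabs and line breaks so a field cannot break the row layout. */
    static String clean(String field) {
        return field == null ? "" : field.replaceAll("[\\t\\r\\n]+", " ").trim();
    }

    /** Renders one card per line with fields separated by a tab character. */
    static String toTsv(List<Flashcard> cards) {
        return cards.stream()
                .map(c -> String.join("\t",
                        clean(c.term()), clean(c.definition()), clean(c.example())))
                .collect(Collectors.joining("\n"));
    }
}
```

The returned string can be written to a UTF-8 text file and imported into Anki, mapping the three columns to note fields during import.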
Technical Architecture
┌────────────────────────────────────┐
│ Angular Frontend (SPA)             │
│ - Document Upload                  │
│ - Reading Interface                │
│ - Vocabulary Dashboard             │
│ - Flashcard Reviewer               │
└──────────────┬─────────────────────┘
               │
      ┌────────┴────────┐
      │                 │
┌─────▼─────────────────▼───────┐
│ Spring Boot REST API          │
│ - Auth Service                │
│ - Document Parser             │
│ - NLP Engine (vocabulary)     │
│ - Review Service (SRS)        │
│ - TTS Service                 │
└──────────────┬────────────────┘
      ┌────────┴─────────┬──────────────┐
      │                  │              │
┌─────▼─────┐   ┌────────▼──┐    ┌──────▼────┐
│PostgreSQL │   │ MongoDB   │    │Google TTS │
│(Progress) │   │(Document  │    │    API    │
│           │   │ Content)  │    │           │
└───────────┘   └───────────┘    └───────────┘
Tech Stack
Frontend
- Angular 21
- TypeScript
- RxJS
- TailwindCSS
- Web Audio API
Backend
- Java 17
- Spring Boot 3
- OpenNLP
- Apache PDFBox
- Google Cloud TTS
Data & DevOps
- PostgreSQL
- MongoDB
- Redis (for caching)
- AWS Lambda (document processing)
Technical Challenges
Natural Language Processing at Scale
Integrated OpenNLP for tokenization and POS tagging to extract meaningful vocabulary. Built a processing pipeline using Spring Batch to handle large PDF uploads asynchronously without blocking users.
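To make the extraction step concrete, the sketch below counts candidate vocabulary in plain Java. It is a simplified stand-in, not the OpenNLP pipeline described above: a regular expression replaces the OpenNLP tokenizer, and a caller-supplied set of known words replaces POS-based filtering. The `VocabularyExtractor` class, the `knownWords` parameter, and the three-letter minimum are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VocabularyExtractor {

    // A word is a run of letters, optionally joined by apostrophes or hyphens.
    private static final Pattern WORD = Pattern.compile("\\p{L}+(?:['-]\\p{L}+)*");

    /**
     * Returns up to {@code limit} unknown words with their frequencies,
     * ordered by descending count and then alphabetically.
     */
    static Map<String, Integer> extract(String text, Set<String> knownWords, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            String word = m.group().toLowerCase(Locale.ROOT);
            // Assumption: very short tokens and already-known words are skipped.
            if (word.length() < 3 || knownWords.contains(word)) {
                continue;
            }
            counts.merge(word, 1, Integer::sum);
        }
        Map<String, Integer> ranked = new LinkedHashMap<>();
        counts.entrySet().stream()
                .sorted((a, b) -> {
                    int byCount = Integer.compare(b.getValue(), a.getValue());
                    return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
                })
                .limit(limit)
                .forEach(e -> ranked.put(e.getKey(), e.getValue()));
        return ranked;
    }
}
```

In a fuller pipeline, the tokenizer and POS tagger would supply lemmas and word classes, so inflected forms could be merged and function words dropped without a hand-maintained word list.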
Spaced Repetition Algorithm (SRS)
Implemented the SM-2 algorithm to compute each card's next review interval from the user's self-rated recall quality. Used scheduled jobs backed by PostgreSQL to generate review tasks automatically, with no manual intervention.
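The interval calculation can be sketched as follows. This follows one common formulation of SM-2 (quality ratings from 0 to 5, a starting ease factor of 2.5, and a floor of 1.3). The `CardState` record and `Sm2` class are hypothetical names for this example, and the project's own implementation may differ in details such as rounding or when the ease factor is updated.

```java
import java.time.LocalDate;

/** Hypothetical per-card review state. */
record CardState(int repetitions, int intervalDays, double easeFactor) {
    static CardState fresh() {
        return new CardState(0, 0, 2.5);
    }
}

final class Sm2 {
    static final double MIN_EASE = 1.3;

    /** Applies one review with a quality rating from 0 (blackout) to 5 (perfect recall). */
    static CardState review(CardState state, int quality) {
        if (quality < 0 || quality > 5) {
            throw new IllegalArgumentException("quality must be between 0 and 5");
        }
        int repetitions;
        int interval;
        if (quality >= 3) {
            // Successful recall: 1 day, then 6 days, then previous interval times ease.
            if (state.repetitions() == 0) {
                interval = 1;
            } else if (state.repetitions() == 1) {
                interval = 6;
            } else {
                interval = (int) Math.round(state.intervalDays() * state.easeFactor());
            }
            repetitions = state.repetitions() + 1;
        } else {
            // Failed recall: restart the sequence and review again the next day.
            repetitions = 0;
            interval = 1;
        }
        int miss = 5 - quality;
        double ease = state.easeFactor() + (0.1 - miss * (0.08 + miss * 0.02));
        return new CardState(repetitions, interval, Math.max(MIN_EASE, ease));
    }

    /** Date on which the card is next due, given the day it was reviewed. */
    static LocalDate dueDate(LocalDate reviewedOn, CardState state) {
        return reviewedOn.plusDays(state.intervalDays());
    }
}
```

Starting from `CardState.fresh()`, three consecutive reviews rated 5 yield intervals of 1, 6, and 16 days. A later rating below 3 resets the interval to 1 day and lowers the ease factor, so that card then comes up more often.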
Future Improvements
- → AI-powered reading difficulty recommendations
- → Collaborative learning groups with shared vocabulary lists
- → Integration with popular books/articles APIs
- → Mobile app for offline studying
- → Integration with language learning platforms (Duolingo API)
- → Grammar checker with explanations