DropLine: An Asynchronous RAG Architecture for Resolving Wrapper Link Bottlenecks
-
Author(s):
Aditya Shankar Khorne | Ajay Nabaji Virkar | Abhijit Ramkisan Yamgar
-
Abstract:
We present DropLine, an AI-assisted system that converts arbitrary web links into structured educational content using retrieval-augmented generation (RAG). The system addresses the problem of “signpost versus destination” links: many URLs function only as pointers to boilerplate HTML, while their intended information remains hidden behind dynamic frames, redirects, or embedded media. We formalize this as a URI resolution challenge combined with dynamic content accessibility. DropLine’s backend is implemented with FastAPI for asynchronous throughput. It resolves URLs, follows redirects and shorteners, detects the content type (webpage, video, or image), and extracts content using specialized modules. For web pages, we use Trafilatura to perform high-precision boilerplate removal; for videos, we use youtube-transcript-api to retrieve captions without requiring an API key. The cleaned text, such as a Wikipedia article or video transcript, is then passed to Google DeepMind’s Gemini 2.5 Flash model, which supports a large multimodal context window. Our prompt instructs Gemini to generate five pedagogical outputs: a concise summary, key concepts, an analogy-based explanation for beginners, real-world applications, and a short quiz. A Streamlit user interface presents the result as an interactive tutor. Using st.session_state, we cache the document context and conversation history, allowing users to ask follow-up questions grounded in the same source material. In evaluation, DropLine successfully processed over 1.2 million characters of raw HTML from a Wikipedia case study and produced approximately 180,232 characters of clean text, retaining about 15% of the original content, which is consistent with known benchmarks. The system also recovered from HTTP 503 and 429 errors using an exponential backoff strategy. These results show that the asynchronous RAG pipeline can reliably bridge modern web links and rich multimodal knowledge delivery.
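The abstract names an exponential backoff strategy for HTTP 503 and 429 responses but does not give its parameters. The following is a minimal Python sketch of that general technique, not the authors' implementation: the names `TransientHTTPError` and `call_with_backoff`, the retry count, the base delay, the cap, and the use of random jitter are all assumptions made for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# The two status codes the abstract says the system recovers from.
RETRYABLE_STATUSES = {429, 503}


class TransientHTTPError(Exception):
    """Hypothetical exception carrying the HTTP status of a failed request."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_backoff(
    fetch: Callable[[], T],
    retries: int = 4,        # assumed value; not stated in the abstract
    base: float = 1.0,       # assumed first delay, in seconds
    cap: float = 30.0,       # assumed upper bound on any single delay
    sleep: Callable[[float], None] = time.sleep,
    jitter: bool = True,
) -> T:
    """Call fetch(), retrying on 429/503 with delays of base * 2**attempt."""
    for attempt in range(retries + 1):
        try:
            return fetch()
        except TransientHTTPError as err:
            # Give up on non-retryable statuses or when retries are exhausted.
            if err.status not in RETRYABLE_STATUSES or attempt == retries:
                raise
            delay = min(cap, base * 2 ** attempt)
            # Optional jitter spreads retries out so clients do not retry in lockstep.
            sleep(random.uniform(0, delay) if jitter else delay)
    raise ValueError("retries must be non-negative")
```

With `jitter=False` the waits are 1 s, 2 s, 4 s, and so on up to the cap. In an asynchronous backend such as the FastAPI service described above, the same loop would presumably await `asyncio.sleep` rather than block on `time.sleep`.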
Other Details
-
Paper id:
IJSARTV12I4105198
-
Published in:
Volume: 12 Issue: 4 April 2026
-
Publication Date:
2026-04-29