Overview of My Submission
Konohagakure Search is a Google-like search engine built with Django, a dedicated MongoDB server, Python, Scrapy, spaCy, and NLTK. When a search query is given, it first searches the database for stored results, then searches the internet, saves the data it finds, and finally presents the results to you in a pretty way :)
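To make that flow concrete, here is a minimal sketch of one reading of it, assuming the web is only consulted when the database has nothing stored for the query. The dictionary `store` is a stand-in for the MongoDB collection, and `fetch_from_web` is a hypothetical callable standing in for the crawler; neither name comes from the project's code.

```python
from typing import Callable, Dict, List


def search(
    query: str,
    store: Dict[str, List[str]],
    fetch_from_web: Callable[[str], List[str]],
) -> List[str]:
    """Return results for `query`, checking the local store before the web."""
    # Normalise the query so "Django" and " django " share one stored entry.
    key = query.strip().lower()

    # 1. Already saved: answer straight from the store.
    if key in store:
        return store[key]

    # 2. Not saved yet: go out to the web (hypothetical crawler callable).
    results = fetch_from_web(key)

    # 3. Save the fresh results so the next identical query is a store hit.
    store[key] = results

    # 4. Hand the results back to the caller for rendering.
    return results
```

With this shape, a second search for the same query never reaches the crawler, which is the main point of saving the data.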
Overview of the project
An efficient Search Engine with the following features:
- It has distributed crawlers that crawl private/air-gapped networks (data sources in these networks might include websites, files, and databases) and work behind sections of networks secured by firewalls.
- It uses AI/ML/NLP/BDA for better search (queries and results).
- It abides by secure coding practices (and SANS Top 25 web vulnerability mitigation techniques).
- It is a search engine that takes a keyword/expression as input and crawls the web (internal network or internet) to get all the relevant information. The application doesn't have any vulnerabilities; it complies with the OWASP Top 10. It scrapes data, matches it against the query, and returns relevant/related information.
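As a rough illustration of the "scrape, match, return" step, the sketch below ranks already-scraped pages by how often the query terms appear in them. This is a deliberately simple term-count stand-in, not the project's actual matching logic (which builds on spaCy, NLTK, and transformers); the function names and the `pages` mapping of URL to page text are assumptions made for the example.

```python
import re
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_pages(query: str, pages: Dict[str, str]) -> List[Tuple[str, int]]:
    """Score each page by occurrences of query terms; drop pages with no match."""
    terms = set(tokenize(query))
    scored = []
    for url, text in pages.items():
        # Count every token in the page that is one of the query terms.
        score = sum(1 for token in tokenize(text) if token in terms)
        if score > 0:
            scored.append((url, score))
    # Highest score first; ties keep their original order.
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

A real ranking would weigh rare terms more heavily and handle word forms, but the shape stays the same: tokenize, score, filter, sort.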
Note - The search should be as robust as possible (e.g., it can correct a misspelt query, suggest similar search terms, etc.); be creative in your approach. Results obtained from the search engine should display the relevant matches for the search query/keyword along with the time taken by the search engine to fetch them.
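The robustness ideas in the note (correcting a misspelt query, suggesting similar terms, and reporting the time taken) can be sketched with the standard library alone. This example uses `difflib` as a swapped-in technique rather than the textblob/spaCy/NLTK stack the project lists, and `VOCABULARY`, `correct_query`, `suggest_terms`, and `timed_search` are all hypothetical names chosen for illustration.

```python
import difflib
import time
from typing import Callable, List, Tuple

# Hypothetical vocabulary; a real one would come from the indexed documents.
VOCABULARY = ["python", "django", "scrapy", "search", "engine", "crawler"]


def correct_query(query: str, vocabulary: List[str]) -> str:
    """Replace each word with its closest vocabulary entry, if one is close enough."""
    corrected = []
    for word in query.lower().split():
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=0.8)
        # Keep the original word when nothing in the vocabulary is similar.
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


def suggest_terms(word: str, vocabulary: List[str], limit: int = 3) -> List[str]:
    """Return up to `limit` vocabulary entries that look similar to `word`."""
    return difflib.get_close_matches(word.lower(), vocabulary, n=limit, cutoff=0.6)


def timed_search(
    query: str, run_search: Callable[[str], List[str]]
) -> Tuple[List[str], float]:
    """Run `run_search` and return its results with the elapsed time in seconds."""
    start = time.perf_counter()
    results = run_search(query)
    elapsed = time.perf_counter() - start
    return results, elapsed
```

The elapsed value returned by `timed_search` is what a results page could show next to the matches.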
Submission Category:
Choose Your Own Adventure
Link to Code
Dhruvacube / search-engine
Google Like Search Engine
Konohagakure Search
…Packages that were required in making this project
- django
- dj-database-url
- celery
- django-redis
- django-htmlmin
- gunicorn
- redis
- hiredis
- djongo
- pymongo[srv]
- python-dotenv
- requests
- beautifulsoup4
- textblob
- spacy
- nltk
- spacy-alignments
- spacy-legacy
- spacy-loggers
- colorama
- transformers
- Scrapy
- cdx-toolkit
- uvicorn
- whitenoise
- colorlog
- uvloop
- spacy-transformers
- spacy-lookups-data
- django-celery-beat
- django-cors-headers
A YouTube Video Explaining It All