A simple web search engine built with ASP.NET Core that ranks results by word frequency and PageRank.
- ✨ Features
- 📋 Example
- 📁 Repository Structure
- 🚀 Getting Started
- 🧠 How It Works
- 🛠️ Technical Details
- 📝 Scraping & Indexing
- 🐳 Docker Image
- 📞 Contact
- Dual-ranking system: Compare results by frequency or PageRank
- Clean, responsive UI: User-friendly interface that works on all devices
- Fast performance: Optimized for quick search results
Here's what the search engine looks like in action:
Example showing search results for "help" with both frequency and PageRank rankings
Search-Engine/
├── Scraping&Indexing/ # Python scripts for data preparation
│ ├── scraping.py # Web crawler and PageRank calculator
│ ├── requirements.txt # Requirements for running the Python scripts
│ ├── inverted_index.py # Creates searchable index
│ └── InsertDataIntoDB.py # Populates database
│
└── [ASP.NET Core files] # Main application files
- .NET 8 SDK or newer
- Visual Studio 2022 or Visual Studio Code
- Python 3.6+ with packages: requests, beautifulsoup4, pyodbc, networkx, pandas
- SQL Server Database
- Clone the repository:
  git clone https://github.com/Mohammed-Eissa/Search-Engine.git
  cd Search-Engine
- Install the required Python dependencies (the folder name is quoted because it contains an `&`):
  pip install -r "Scraping&Indexing/requirements.txt"
- Run the Python scripts from the Scraping&Indexing folder. To crawl a different site, change the seed URL in scraping.py first. Update the database connection in InsertDataIntoDB.py if needed.
  cd "Scraping&Indexing"
  python scraping.py
  python inverted_index.py scraped_pages output_dir
  python InsertDataIntoDB.py
- Configure and run the ASP.NET application locally:
  - Return to the main directory:
    cd ..
  - Go to the .NET folder:
    cd Web-code
  - Update the connection string in appsettings.json
  - Open the .sln file in Visual Studio or the folder in VS Code
  - Run the application (F5 in Visual Studio, or dotnet run in a terminal)
Alternatively, run it with Docker Compose:
  - Return to the main directory:
    cd ..
  - Go to the .NET folder:
    cd Web-code
  - Run this command:
    docker-compose up -d
- Web Scraping: Collects page content and link structure
- Inverted Indexing: Creates a searchable index of words and their locations
- PageRank Calculation: Determines page importance based on link relationships
- Database Storage: Stores all processed data for fast retrieval
- Search & Ranking: Retrieves and ranks results when users search
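The last step, ranking the same result set two ways, can be sketched in a few lines of Python. This is an illustration only: the actual search runs in the ASP.NET Core application against SQL Server, and the `INDEX` and `PAGERANK` dictionaries, the `search` function, and the example URLs below are hypothetical stand-ins for the stored data.

```python
from typing import Dict, List

# Hypothetical in-memory stand-ins for the database contents.
# INDEX maps a word to the pages containing it and how often it occurs there.
INDEX: Dict[str, Dict[str, int]] = {
    "help": {"a.example/faq": 5, "a.example/home": 1, "a.example/contact": 2},
}
# PAGERANK maps each page to its precomputed importance score.
PAGERANK: Dict[str, float] = {
    "a.example/faq": 0.2,
    "a.example/home": 0.5,
    "a.example/contact": 0.3,
}


def search(word: str,
           index: Dict[str, Dict[str, int]],
           pagerank: Dict[str, float],
           by: str = "frequency") -> List[str]:
    """Return the URLs containing `word`, best match first."""
    postings = index.get(word.lower(), {})
    if by == "frequency":
        # Rank by how many times the word occurs on each page.
        def key(url: str) -> float:
            return postings[url]
    elif by == "pagerank":
        # Rank by link-based importance, ignoring the word count.
        def key(url: str) -> float:
            return pagerank.get(url, 0.0)
    else:
        raise ValueError(f"unknown ranking mode: {by!r}")
    return sorted(postings, key=key, reverse=True)


by_frequency = search("help", INDEX, PAGERANK, by="frequency")
by_pagerank = search("help", INDEX, PAGERANK, by="pagerank")
```

With this sample data the two orderings differ: the FAQ page leads by frequency, while the home page leads by PageRank.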
- Backend: ASP.NET Core MVC
- Frontend: HTML, CSS, JavaScript, Bootstrap
- Data Processing: Python (BeautifulSoup, NetworkX, Pandas)
- Data Storage: SQL Server Database
- Algorithms: Inverted Index, PageRank
The Python scripts in the Scraping&Indexing folder prepare data for the search engine:
scraping.py:
- Crawls web pages from a seed URL
- Extracts text content and builds link graph
- Calculates PageRank scores
- Saves to text files and CSV
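To show what the PageRank step computes, here is a minimal power-iteration sketch in plain Python. It is not the project's code: NetworkX is listed as a dependency, so the real script may well rely on a library routine instead. The `pagerank` function, its `damping` and `iterations` parameters, and the three-page graph are assumptions made for illustration.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Approximate PageRank for a link graph given as page -> outgoing links."""
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    if n == 0:
        return {}

    # Start from a uniform distribution.
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        # Every page passes an equal share of its rank to each page it links to.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: a links to b and c, b links to c, c links back to a.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = pagerank(graph)
```

Page `c` ends up with the highest score because both other pages link to it, and the scores sum to 1.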
inverted_index.py:
- Processes scraped content
- Creates word-to-document mapping with frequencies
- Outputs an inverted index file
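The word-to-document mapping with frequencies can be sketched as follows. This is a simplified illustration, not the contents of inverted_index.py: the `build_inverted_index` function, the tokenization rule (lowercase runs of letters and digits), and the sample pages are assumptions, and the real script reads scraped files and writes an index file rather than working in memory.

```python
import re
from collections import Counter, defaultdict
from typing import Dict


def build_inverted_index(documents: Dict[str, str]) -> Dict[str, Dict[str, int]]:
    """Map each word to the documents containing it and its count in each."""
    index: Dict[str, Dict[str, int]] = defaultdict(dict)
    for url, text in documents.items():
        # Assumed tokenization: lowercase, keep runs of letters and digits.
        words = re.findall(r"[a-z0-9]+", text.lower())
        for word, count in Counter(words).items():
            index[word][url] = count
    return dict(index)


# Hypothetical scraped pages, keyed by URL.
pages = {
    "a.example/faq": "Help me, help!",
    "a.example/home": "No help here",
}
index = build_inverted_index(pages)
```

Looking up `index["help"]` then returns both pages together with the number of times the word occurs on each.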
InsertDataIntoDB.py:
- Creates database tables
- Loads PageRank scores
- Populates database with words, URLs, and mappings
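The loading step can be sketched with Python's built-in `sqlite3` module. Note the substitutions: the project targets SQL Server through pyodbc, and SQLite is used here only so the example runs without a database server. The table names (`urls`, `words`, `word_urls`), the column layout, and the `load_into_db` function are hypothetical; the real schema may differ.

```python
import sqlite3
from typing import Dict


def load_into_db(conn: sqlite3.Connection,
                 index: Dict[str, Dict[str, int]],
                 pagerank: Dict[str, float]) -> None:
    """Create the tables and store URLs, words, and word-to-URL frequencies."""
    cur = conn.cursor()
    # Hypothetical schema: one table per entity plus a mapping table.
    cur.executescript("""
        CREATE TABLE IF NOT EXISTS urls (
            id INTEGER PRIMARY KEY,
            url TEXT UNIQUE NOT NULL,
            pagerank REAL NOT NULL
        );
        CREATE TABLE IF NOT EXISTS words (
            id INTEGER PRIMARY KEY,
            word TEXT UNIQUE NOT NULL
        );
        CREATE TABLE IF NOT EXISTS word_urls (
            word_id INTEGER NOT NULL REFERENCES words(id),
            url_id INTEGER NOT NULL REFERENCES urls(id),
            frequency INTEGER NOT NULL,
            PRIMARY KEY (word_id, url_id)
        );
    """)
    # Load the PageRank score of every known page.
    for url, score in pagerank.items():
        cur.execute("INSERT OR IGNORE INTO urls (url, pagerank) VALUES (?, ?)",
                    (url, score))
    # Load the words and their per-page frequencies.
    for word, postings in index.items():
        cur.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
        word_id = cur.execute("SELECT id FROM words WHERE word = ?",
                              (word,)).fetchone()[0]
        for url, frequency in postings.items():
            # Pages without a PageRank score get 0.0 so the mapping stays valid.
            cur.execute("INSERT OR IGNORE INTO urls (url, pagerank) VALUES (?, 0.0)",
                        (url,))
            url_id = cur.execute("SELECT id FROM urls WHERE url = ?",
                                 (url,)).fetchone()[0]
            cur.execute("INSERT OR REPLACE INTO word_urls VALUES (?, ?, ?)",
                        (word_id, url_id, frequency))
    conn.commit()


conn = sqlite3.connect(":memory:")
load_into_db(conn,
             {"help": {"a.example/faq": 2, "a.example/home": 1}},
             {"a.example/faq": 0.4, "a.example/home": 0.6})
rows = conn.execute("""
    SELECT u.url, wu.frequency, u.pagerank
    FROM word_urls AS wu
    JOIN words AS w ON w.id = wu.word_id
    JOIN urls AS u ON u.id = wu.url_id
    WHERE w.word = 'help'
    ORDER BY wu.frequency DESC
""").fetchall()
```

Keeping the frequency in the mapping table and the PageRank score on the URL row lets a single join serve both ranking modes: the query only needs a different `ORDER BY` column.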
Deploy the search engine effortlessly with Docker:
- Pull the image:
  docker pull yousry97/searchengine:v1.0
- Run the container:
  docker run -p 8080:8080 yousry97/searchengine:v1.0
  Access the application at http://localhost:8080.
- Docker Hub: yousry97/searchengine
To build the image locally:
docker build -t search-engine .
Tip: Ensure Docker is installed and running on your system.
Mohammed Eissa - GitHub Profile
LinkedIn Profile
Project Link: https://github.com/Mohammed-Eissa/Search-Engine
