For this project, I worked with multiple datasets containing Toronto Census data (3 million rows!). I supplemented this data with additional information gathered through web scraping and data mining.
Because my data came from several sources, the project involved a lot of data cleaning, manipulation, and joining. Many of these tasks were performed using SQL, and the data was stored in a relational database.
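As a rough illustration of this kind of cleaning and joining (not the project's actual code), here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and values are all made up for the example; the real schema and database engine are not described here.

```python
import sqlite3

# Hypothetical schema and values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE census (neighborhood TEXT, population INTEGER);
    CREATE TABLE scraped (neighborhood TEXT, park_count INTEGER);
    INSERT INTO census VALUES (' Annex ', 100), ('Riverdale', 200), ('Annex', NULL);
    INSERT INTO scraped VALUES ('annex', 3), ('riverdale', 5);
""")

# Clean and join in one query:
# - TRIM/LOWER normalize names so the two sources can be matched
# - rows with a missing population are dropped
query = """
    SELECT LOWER(TRIM(c.neighborhood)) AS name_key,
           c.population,
           s.park_count
    FROM census AS c
    JOIN scraped AS s
      ON LOWER(TRIM(c.neighborhood)) = LOWER(TRIM(s.neighborhood))
    WHERE c.population IS NOT NULL
    ORDER BY name_key
"""
rows = conn.execute(query).fetchall()
conn.close()

for row in rows:
    print(row)
```

Normalizing the join key inside the query keeps the raw tables untouched, which makes it easier to re-run the cleaning step when a source is refreshed.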
I used clustering to group neighborhoods that were similar across a range of properties.
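A minimal sketch of what such a clustering step could look like with scikit-learn is shown below. It assumes k-means on standardized features; the specific algorithm, the two features, the number of clusters, and the values are assumptions for illustration, not details of the project.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Made-up features per neighborhood: [median income, population density].
features = np.array([
    [40_000, 9_000],
    [42_000, 8_500],
    [95_000, 2_000],
    [98_000, 2_300],
], dtype=float)

# Standardize so that no single feature dominates the distance measure.
scaled = StandardScaler().fit_transform(features)

# Assumed choice: k-means with two clusters and a fixed seed for repeatability.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(scaled)

print(labels)
```

The cluster numbers themselves are arbitrary; what matters is which neighborhoods end up sharing a label.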
With that information, I developed a scoring method to rank and rate neighborhoods based on their profiles and characteristics.
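One simple way to build a score like this is a weighted sum of min-max normalized features. The sketch below assumes that approach; the feature names, weights, and values are hypothetical, and the project's actual scoring method may differ.

```python
def min_max(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def score_neighborhoods(profiles, weights):
    """Return (name, score) pairs sorted from highest to lowest score."""
    names = list(profiles)
    totals = {name: 0.0 for name in names}
    for feature, weight in weights.items():
        scaled = min_max([profiles[name][feature] for name in names])
        for name, value in zip(names, scaled):
            totals[name] += weight * value
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical neighborhood profiles and weights.
profiles = {
    "A": {"transit": 8, "green_space": 2},
    "B": {"transit": 5, "green_space": 6},
    "C": {"transit": 2, "green_space": 10},
}
weights = {"transit": 0.7, "green_space": 0.3}

ranking = score_neighborhoods(profiles, weights)
for name, score in ranking:
    print(name, round(score, 2))
```

Changing the weights changes the ranking, so the same profiles can be scored differently depending on what matters most to the person choosing a neighborhood.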
Finally, I created visualizations of my findings to illustrate the relationships and trends in the data.
Used: BeautifulSoup, Python, SQL, scikit-learn, Matplotlib, Dash, Flask