Being the main topic of the "Introduction to database systems" course at EPFL, this application aims at dealing with large amounts of data and making them accessible through common user interfaces. Tens of millions of database records had to be gathered, imported, formatted and then accessed and modified through a web application developed in PHP. The whole application had to satisfy strict timing constraints about queries and loading times.
Conceptual modelling and normalization constraints
Developed in parallel with the lectures learning us what are the good practices in the field, the conceptual model for the database had to respect numerous constraint, and was reviewed by PhD teaching assistants who advised us about improvements that could be made. Despite "too much normalized", our model used to be very easy to use and program around.
Advanced SQL queries
A list of SQL queries was asked to design. Most of them were very specific, e.g. "For each country, list the most frequent character name that appears in the productions of a production company (not a distributor) from that country". In order to process them within an acceptable response time, it was necessary to choose carefully different indexes, choose the order of join etc.
Query execution plan understanding and performance optimization
The final goal of the course and project was to learn to optimize queries and obtain a system as efficient as possible. Research has been conducted on query execution plans given by MySQL, and optimizations were profiled and applied to various parts of the program and underlying queries to obtain response times inferior to 1 second on a locally stored database with >60 millions tuples running on a laptop.