Full stack movie database blog post

Project Page

Project Github Repository

Recently I became aware of the IMDB datasets that are updated daily by IMDB so I decided to use them to create a full stack application.

For the backend I chose Java with Spring boot, for the database Postgres and for the frontend React. I wanted everything to be containerized so I used a docker-compose file combining two images, the Postgres:14 image and my custom app image.

Part one: Automating the ETL process

The data comes in 7 .gz files, so the process creates 7 threads. Each thread downloads and extracts the data file. Afterwards every file enters the preprocessing phase. This phase removes any mistakes in the data format and makes sure the syntax is ready to enter the postgres database. The last phase is the upsert phase using pure SQL. This phase needs to wait a bit for the JDBC connection to the database to be implemented.

Part two: Spring boot Controller, Service and DAO

The controller for the backend listens at /api/v1 for now. It accepts requests using the standard spring @RESTController functionality. The controller layer calls the appropriate method from the Service. The service in most situations passes the request unchanged to the DAO for the data to be fetched from the postgres database, sometimes though it processes the request for the proper query to be called from the DAO. The DAO is implemented as a @Repository using JDBCTemplate with a HikariDataSource. It uses .query* or .update with pure SQL for its operations.

Now that we have a JDBCTemplate connected to the database we can finish the upsert phase from part one. This is done using temp tables, copying the data from the processed files into the temp tables and then upserting them to the main tables discarding any data that do not comply with the schema’s constraints.

Part three: Frontend

The frontend is written in typescript React and is bundled within the project itself. The frontend listens at / and the backend at /api/v1/. I am not versed into frontend design so my main concern was not aesthetics but functionality (the css is custom and not something like bootstrap). The pagination in search results is handled by the backend and uses LIMIT and OFFSET in sql, so the frontend doesn’t fetch the full result lists and isn’t responsible for paginating them. I thought that would be the most responsive way that would minimize loading times.

Movie Database Schema:

Movie Database Schema

Running database update Docker output:

Database Update Docker output

Vasilis Vlasopoulos

Vasilis Vlasopoulos

dev/sec/sre

comments powered by Disqus