# VA ONLINE MEMORIAL - DATA IMPORT & SYNC

## Dependencies
- [Nodejs](https://nodejs.org/en/)
- [PostgreSQL](https://www.postgresql.org/)
- [eslint](http://eslint.org/)

## Configuration
- Edit the configuration in `config/default.json` and the custom environment variable names in `config/custom-environment-variables.json`.
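
This file layout (`config/default.json` plus `config/custom-environment-variables.json`) follows the convention of the [node-config](https://github.com/lorenwest/node-config) module, so settings are presumably read like the minimal sketch below. The key names (`dbConfig.db_url`, `downloadPath`) are illustrative assumptions, not the app's actual schema:

```js
// Hedged sketch: reading settings via node-config.
// Key names below are assumptions for illustration only.
const config = require('config');

const dbUrl = config.get('dbConfig.db_url');     // e.g. a PostgreSQL connection string
const downloadPath = config.get('downloadPath'); // directory for downloaded datasets
console.log(`DB: ${dbUrl}, downloads: ${downloadPath}`);
```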

## Application constants

- Application constants can be configured in `./constants.js`
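
A minimal sketch of what such a constants module might export; the names and values here are purely illustrative assumptions:

```js
// Hypothetical sketch of ./constants.js — names and values are
// illustrative, not the app's real constants.
module.exports = {
  DOWNLOAD_CONCURRENCY: 2,        // parallel dataset downloads
  BATCH_SIZE: 1000,               // rows per bulk insert
  DATASET_FILE_EXTENSION: '.csv', // expected dataset file type
};
```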

## Available tools

- Since the data to download and process is huge, it is better (and safer) to use two separate tools instead of a single script, so that if something goes wrong during processing the damage is contained to that step.

### Download datasets

- Run `npm run download-data` to download all available datasets.
- The datasets will be stored in the configured directory.
- Old data will be replaced.
- This operation does not affect the database.
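
Conceptually, this step is a stream-to-disk loop. A minimal sketch under assumed names (`downloadDataset` is hypothetical; the real tool takes its dataset URLs and target directory from the configuration):

```js
// Hedged sketch: stream one dataset into the configured directory,
// overwriting any previous copy. Names are illustrative.
const fs = require('fs');
const path = require('path');
const https = require('https');

function downloadDataset(url, dir) {
  return new Promise((resolve, reject) => {
    const dest = path.join(dir, path.basename(url));
    const file = fs.createWriteStream(dest); // truncates, so old data is replaced
    https.get(url, (res) => {
      res.pipe(file);
      file.on('finish', () => file.close(() => resolve(dest)));
    }).on('error', reject);
  });
}
```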

### Import data from downloaded files

- Run `npm run import-data` to import all data using the downloaded files from the previous step.
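
Conceptually, the import step parses each CSV file and inserts its rows into PostgreSQL. A minimal sketch assuming the `pg` and `csv-parse` packages and an illustrative table/column layout (the real schema may differ):

```js
// Hedged sketch: load one downloaded CSV into PostgreSQL row by row.
// Table and column names are assumptions for illustration only.
const fs = require('fs');
const { parse } = require('csv-parse');
const { Client } = require('pg');

async function importFile(file) {
  const client = new Client(); // connection settings from PG* env vars
  await client.connect();
  const parser = fs.createReadStream(file).pipe(parse({ columns: true }));
  for await (const row of parser) {
    await client.query(
      'INSERT INTO burials (first_name, last_name) VALUES ($1, $2)',
      [row.d_first_name, row.d_last_name]
    );
  }
  await client.end();
}
```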

## Local Deployment

*Before starting the application, make sure that PostgreSQL is running and that you have configured everything correctly in `config/default.json`.*

- Install dependencies: `npm i`
- Run the lint check: `npm run lint`
- Start the app: `npm start`. This will run all tools in the following sequence:

`npm run download-data` => `npm run import-data`
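
A sketch of how `npm start` might chain the two tools (purely illustrative; the actual wiring lives in `package.json` and the app's entry point):

```js
// Hedged sketch: run the two tools in order, aborting if a step fails.
const { execSync } = require('child_process');

for (const script of ['download-data', 'import-data']) {
  execSync(`npm run ${script}`, { stdio: 'inherit' }); // throws on non-zero exit
}
```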

*The application will print progress information and the results in the terminal.*

## Verification

- To verify that the data has been imported, you can use the [pgAdmin](https://www.pgadmin.org/) tool and browse the database.
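
Alternatively, a quick row count from Node (the table name below is an assumption; substitute whatever tables the import actually creates):

```js
// Hedged sketch: count rows in one imported table via the pg client.
const { Client } = require('pg');

(async () => {
  const client = new Client(); // connection settings from PG* env vars
  await client.connect();
  const { rows } = await client.query('SELECT COUNT(*) AS n FROM burials');
  console.log(`imported rows: ${rows[0].n}`);
  await client.end();
})();
```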

## Notes

- The total size of all datasets is > 1.5GB, so the operation will take quite some time to finish, depending on your internet connection.
- `--max-old-space-size` has been set to *4096MB* so that such huge data files can be parsed/processed without any issues. The app frees the memory right after using the data to prevent memory/heap leaks.
- The dataset for `FOREIGN ADDRESSES` has no header row in its CSV file and a slightly different format (an extra column). The app handles all datasets without any issue (see the sketch below).
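
One way a headerless CSV with an extra column can be handled is by supplying the column names explicitly to the parser. A sketch with assumed file and column names (the real ones may differ):

```js
// Hedged sketch: parse the headerless FOREIGN ADDRESSES file by naming
// the columns up front. File and column names are illustrative assumptions.
const fs = require('fs');
const { parse } = require('csv-parse');

const parser = fs.createReadStream('foreign_addresses.csv').pipe(
  parse({
    // 'country' stands in for the extra column this dataset carries
    columns: ['first_name', 'last_name', 'cemetery', 'foreign_address', 'country'],
  })
);

parser.on('data', (row) => console.log(row.country));
```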