I have been given a project to query various API’s in our environment like CloudStack and Kubernetes and update an RDBMS database (currently mysql 8) that is used for reporting and capacity planning. There is an existing LAMP-based portal being used for reporting so I’m stuck just augmenting what they have for now; long term I plan to move this into Kubernetes and use something like elasticsearch or solr instead of an RDBMS.
Anyhooo… for the immediate term.
What is the best approach to keeping a database current where you have a lot of churn on rows - some being updated, new ones being inserted and ones being deleted. Updating and inserting is pretty straight forward, I was more curious the best approach to deleting rows that reference objects that no longer exist in our environment.
Is there a “slicker” approach than this for processing objects that need to be deleted:
- run my collection job and get current list of objects
- query database for list of objects
- delete items that don’t exist in the results from step 1
This is a very costly approach and curious if there was a better approach, or library or something? Keeping in mind I can’t change the database at this time.