Managing Inconsistent Data from Multiple APIs

Updated: 13th October 2025
Tags: programming database

When working with multiple data sources, a common approach is to save the data as-is and move on. However, you often need to edit, unify, or supplement that information, and doing it incorrectly can create major headaches.

In our case, we use data from two APIs, each stored in its own table:

Each source may return slightly different versions of the same items. In the main products table, we store item_id_from_api1 and item_id_from_api2 to link each product to its original source data.


Initial Attempt: Unifying Data Early

I initially tried to unify item_name directly by adding a same_as field in the source tables:

id | item_name | same_as | created_at

When a slightly different spelling appeared, I would reference the main item’s id in same_as. A similar approach was used for items_from_api2.

This caused problems: some items had legacy or alternate names, and forcing them into a unified same_as mapping caused misidentifications. The key lesson: never modify source data to unify it—always store it exactly as it comes from the API.


Protecting Source Data

Sometimes we need to edit or add related items in the main products table, such as linking additional item_ids. However, we never change products columns item_id_from_api1 or item_id_from_api2, because doing so would break the link to the original API data and its relationships.


The Local Mapping Solution

To solve this, we implemented a local mapping system:

  1. Created a new table items_from_local and added a column item_id_from_local in the main products table.
  2. Added a local_id column to both items_from_api1 and items_from_api2, mapping each API item to its local equivalent.
  3. Ran a script that updates item_id_from_local in the main table based on the local_id mappings.

When a new item appears in an API table, the script:

This system allows both programmatic updates and manual adjustments, while preserving the integrity of the original API data. Admins can edit, add, or delete entries in the local table. All changes are logged internally. Users can filter or search using item_id_from_local without worrying about inconsistent or misspelled source items.


Benefits of This Approach


Diagram (Conceptual Flow)


       ┌─────────────────────────┐
       │     items_from_api1     │
(API 1 Source)       ├─────────────────────────┤
       │ id  │ name     │ local_id │
1   │ Item 112   │ Item 223   │ Item 334   │ Item 2`2       └─────────────────────────┘
               ┌─────────────────┐
               │ items_from_local│
(Unified Table)               ├─────────────────┤
               │ id  │ name      │
1   │ Item 12   │ Item 23   │ Item 3               └─────────────────┘
       ┌─────────────────────────┐
       │     items_from_api2     │
(API 2 Source)       ├─────────────────────────┤
       │ id  │ name     │ local_id │
1   │ Itemm 112   │ Itemm 223   │ Itemm 334   │ Itemm 2`2       └─────────────────────────┘
               ┌─────────────────┐
               │    products     │
(Main Table)               ├─────────────────┤
               │ id │ item_id_from_api1 │ item_id_from_api2 │ item_id_from_local │
1111222233334442               └─────────────────┘