This concept of data satellites is really useful. I think it originates from @petter’s book, which I highly recommend. It is really thorough and detailed.
One thing I am struggling to wrap my head around is whether there are ever any duplicated data fields between a data type and its satellite, even simple, lightweight structured fields like a “name”.
Take the example in the book - when we want to create a booking.com type app with the following type of search functionality:
Initially the book describes creating one data type container for geographies (e.g. continent, country, destination) and one for points of interest (hotels and restaurants). These containers hold the bulk of the information - images and unstructured data.
So far so good - this is intuitive and makes sense. We can use these containers to hold “heavier” data (images and longer descriptions), grouping concepts with similar information into a single container (geographies in one, points of interest in the other) and distinguishing the entries within each container through option sets (continent, country, destination vs. hotel, restaurant).
However, when it comes to actually creating the search functionality, we don’t necessarily want to reference the data type containers, as they hold images and unstructured data. Instead, the book suggests creating a separate lightweight search/satellite data type. As in the image above, this might contain the kind of entry (hotel, restaurant, continent, country or destination) and an identifying name (Dubai, Shangrila etc.).
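To make the setup concrete, here is a rough Python sketch of how I picture the two types. Bubble itself is no-code, so the class names, field names and the `search` helper are purely my own assumptions for illustration, not anything taken from the book.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SearchItem:
    """Lightweight satellite: only what the search box needs."""
    kind: str  # option-set value, e.g. "hotel", "restaurant", "country"
    name: str  # identifying name, e.g. "Dubai"


@dataclass
class PointOfInterest:
    """Heavy container: images and long unstructured text live here."""
    kind: str
    description: str
    images: List[str]
    search_item: SearchItem  # link from the container to its satellite


def search(items: List[SearchItem], query: str) -> List[SearchItem]:
    """Case-insensitive substring match that only touches satellite rows."""
    q = query.lower()
    return [item for item in items if q in item.name.lower()]


# Hypothetical sample rows for the search box.
items = [
    SearchItem("destination", "Dubai"),
    SearchItem("hotel", "Shangrila"),
    SearchItem("country", "Denmark"),
]
matches = search(items, "du")
```

The point of the split is that `search` never has to load a `PointOfInterest`, so the images and long descriptions stay out of the search path.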
So my question after that long-winded setup is:
Do we want to include the identifying name (i.e. the name of the restaurant, the name of the country etc.) on both the container data type and the search/satellite data type?
Instinctively I would say no - just pick the data type where it is most appropriate, in this case the search/satellite. If we create a hotel page, we will predominantly bring in data from the container data type. Of course we will also need to know the hotel name, but presumably we can just reference it as a sub-thing, so something like hotel container’s search data’s name. As the search data type is lightweight and we are only looking up one row, I don’t think this would cause noticeable performance issues. (We wouldn’t want to reference it the other way round for something else, as that would mean downloading all the data stored on the container data type, but this way round it seems OK.)
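In sketch form, the "store the name once" option I have in mind looks roughly like this. Again, this is only a Python model with made-up names (`Hotel`, `SearchItem`, the `name` property), assuming the container holds a link to its satellite.

```python
from dataclasses import dataclass


@dataclass
class SearchItem:
    kind: str
    name: str  # the only place the name is stored


@dataclass
class Hotel:
    description: str
    search_item: SearchItem

    @property
    def name(self) -> str:
        # Equivalent of "hotel container's search data's name":
        # one extra lookup of a single lightweight row.
        return self.search_item.name


hotel = Hotel("Long description...", SearchItem("hotel", "Shangrila"))
hotel.search_item.name = "Shangrila Resort"  # hypothetical rename, in one place only
```

With this layout there is nothing to keep in sync, because the hotel page always reads the name through the satellite.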
However, later on, the book adds a paragraph on syncing data between different data types. In this case, if we changed the name of the hotel in the container data type, it talks about creating a backend trigger to update the name of the hotel in the search/satellite data type. This seems to imply that the name of the hotel is stored on both the container data type and the search/satellite data type.
If that is the correct understanding, does maintaining these dual data fields potentially have WU implications for the app (not to mention the added complexity of keeping track of them)?
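For comparison, this is how I understand the duplicated-name version with a sync step. It is a minimal Python sketch under my own assumptions: `rename_hotel` is a hypothetical stand-in for the backend trigger, and the write counter is just a toy way of showing the extra update, not a measure of actual WU.

```python
from dataclasses import dataclass


@dataclass
class SearchItem:
    kind: str
    name: str  # duplicated copy used by the search box


@dataclass
class Hotel:
    name: str  # copy kept on the container
    description: str
    search_item: SearchItem


def rename_hotel(hotel: Hotel, new_name: str) -> int:
    """Rename the hotel and mirror the change; return the number of writes."""
    writes = 0
    if hotel.name != new_name:
        hotel.name = new_name
        writes += 1
    # Stand-in for the backend trigger that keeps the satellite in sync.
    if hotel.search_item.name != new_name:
        hotel.search_item.name = new_name
        writes += 1
    return writes


hotel = Hotel("Shangrila", "Long description...", SearchItem("hotel", "Shangrila"))
first = rename_hotel(hotel, "Shangrila Resort")   # both copies change
second = rename_hotel(hotel, "Shangrila Resort")  # nothing left to update
```

In this toy model a rename costs two writes instead of one. Whether that matters in practice presumably depends on how often names change compared with how often they are read.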
I appreciate there may not be a black-and-white answer to this, but I would appreciate @petter’s or anyone else’s thoughts.