Merging data from different sources: which attribute should be the key in the database?

Good day!

Problem description:
A system for large firms that do marketing and competitor data analysis.
One of its functions is recording information about companies and their structure (subsidiaries, branches, etc.) and transmitting this data to the central office.
That is, client machines with a server are installed in different cities (offices).

Attributes. For example:
Short name: Roel
Full name: LLC "NK rule"
INN: 546456456
Organizational form: limited liability company
Country: Russia
Address: Syktyvkar, St. Mark's, 7
Activities: Sale and production of precious metals
Parent company: JSC "Rarus"
Subsidiaries: LLC "FSC", LTD "MARRIAGE"
Owner: Sidorov PI

Each day an analyst passes the information on to the analysts at the central office.
The system is multi-server, i.e. the database is not centralized. So that the information about companies and their structure is identical everywhere, the structure information is transmitted to the central office. In other words, the structures have to be aligned.

It is logical to assume that the key attribute is the "full name" or the INN. But there is one catch:
the analysis turns up firms (subsidiaries) for which the information is incomplete. For example:
"There is information that some company was located in Balma, owner Mycol. There is information that this firm is associated with JSC "Rosadana". Please review and check."

That is, information about an object without its name will be recorded on one of the client machines, and this information will have to be sent on.

Please tell me which attribute (or something "else") should be made the key, and why?

Respectfully
July 9th 19 at 14:01
4 answers
July 9th 19 at 14:03
or something "else"

Create a separate id field and make it the key.
Thanks for the reply. I have a question. Example:
Condition: there are 2 servers running the system, each with its own database.
On server No. 1, user No. 1 entered the company OAO "Yukos" with the branch OOO "Yukos in the passenger seat".
On server No. 2 (the central office), user No. 3 entered the company "Yukos" but does not know about its branches. Besides this company, other companies have also been entered on server No. 2. User No. 1 transmits the data on OAO Yukos and its branches to server No. 2. Question: how does the system (server No. 2) recognize that the incoming data is about OAO "Yukos" and its subsidiary, and update the data accordingly (synchronize, "align" the structure)? - Dean commented on July 9th 19 at 14:06
July 9th 19 at 14:05
id_company (int), that's all. Nothing else.
Objects with the same name on different servers will get different generated IDs. Since the systems are on different servers, how is the data synchronized? - Dean commented on July 9th 19 at 14:08
July 9th 19 at 14:07
The guys are right about the ID. Since the completeness of your data is not guaranteed, the only element that is always present in a record is an additional ID field.
Michael, thank you for your comments. I have a question, which I guess is addressed to everyone. Example:
Condition: there are 2 servers running the system, each with its own database.
On server No. 1, user No. 1 entered the company OAO "Yukos" with the branch OOO "Yukos in the passenger seat".
On server No. 2 (the central office), user No. 3 entered the company "Yukos" but does not know about its branches. Besides this company, other companies have also been entered on server No. 2. User No. 1 transmits the data on OAO Yukos and its branches to server No. 2. Question: how does the system (server No. 2) recognize that the incoming data is about OAO "Yukos" and its subsidiary, and update the data accordingly (synchronize, "align" the structure)? As I understand it, the generated ID of the same object will be different on different servers? - Dean commented on July 9th 19 at 14:10
Honestly, I don't know this topic very well.
If replication is suitable at all, then only synchronous replication.
Otherwise it turns out that you need to split the databases into a master and workers. The differences are first accumulated in the master database, and then the import logic of the master database can control duplicates.
This is a whole separate story: not even a song but a musical. - Dean commented on July 9th 19 at 14:13
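A rough sketch of what "controlling duplicates in the import logic" could look like in Java, under assumptions that are not from this thread: every record carries a generated UUID, and a record that arrives under an unknown id but with a matching INN or normalized name is held for manual review instead of being merged automatically. The names `MasterImport`, `Company`, `Outcome` and `importRecord` are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.UUID;

public class MasterImport {

    // Hypothetical incoming record; fullName and inn may be null.
    record Company(UUID id, String fullName, String inn) {}

    enum Outcome { UPDATED, INSERTED, NEEDS_REVIEW }

    // The master store, keyed by the generated id.
    private final Map<UUID, Company> master = new LinkedHashMap<>();

    // Lower-cases the name and collapses punctuation and spaces for comparison.
    static String normalize(String name) {
        if (name == null) {
            return "";
        }
        return name.toLowerCase(Locale.ROOT)
                   .replaceAll("[^\\p{L}\\p{Nd}]+", " ")
                   .trim();
    }

    Outcome importRecord(Company incoming) {
        // Same generated id: this is a known record, so just update it.
        if (master.containsKey(incoming.id())) {
            master.put(incoming.id(), incoming);
            return Outcome.UPDATED;
        }
        String incomingName = normalize(incoming.fullName());
        for (Company existing : master.values()) {
            boolean sameInn = incoming.inn() != null
                    && incoming.inn().equals(existing.inn());
            boolean sameName = !incomingName.isEmpty()
                    && incomingName.equals(normalize(existing.fullName()));
            if (sameInn || sameName) {
                // Possible duplicate under a different id: leave it to a person.
                return Outcome.NEEDS_REVIEW;
            }
        }
        // Nothing similar is known yet, so store it as a new record.
        master.put(incoming.id(), incoming);
        return Outcome.INSERTED;
    }

    public static void main(String[] args) {
        MasterImport central = new MasterImport();
        UUID id = UUID.randomUUID();
        System.out.println(central.importRecord(new Company(id, "OAO \"Yukos\"", null)));
        System.out.println(central.importRecord(new Company(id, "OAO \"Yukos\"", "546456456")));
        System.out.println(central.importRecord(new Company(UUID.randomUUID(), "oao yukos", null)));
    }
}
```

Running `main` prints INSERTED, UPDATED, NEEDS_REVIEW. Since the data entered by users is unreliable, the name and INN are used here only to raise a flag, never as the key itself.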
Thanks, Michael. Yes, that is probably the direction to dig in. - Dean commented on July 9th 19 at 14:16
July 9th 19 at 14:09
A generated ID is the only correct solution in this case: int/long or UUID, depending on your views and preferences. At one point I tried using various identifiers such as INN, KPP, etc. as keys, but nothing good came of it: the usefulness of data entered by end users is, as a rule, not guaranteed, and neither is its reliability.
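As a minimal sketch of this advice in Java, a company record can carry a generated UUID as its only mandatory field, while the name, INN and parent link stay optional. The names `CompanyKeyDemo`, `Company` and `createNew` are illustrative assumptions, not part of any existing system.

```java
import java.util.UUID;

public class CompanyKeyDemo {

    // Hypothetical record: only the surrogate id is mandatory.
    record Company(UUID id, String fullName, String inn, UUID parentId) {

        Company {
            if (id == null) {
                throw new IllegalArgumentException("id is required");
            }
        }

        // Generates the key on creation; it never depends on user-entered data.
        static Company createNew(String fullName, String inn, UUID parentId) {
            return new Company(UUID.randomUUID(), fullName, inn, parentId);
        }
    }

    public static void main(String[] args) {
        Company parent = Company.createNew("JSC \"Rosadana\"", null, null);
        // A company known only by its link to the parent: no name, no INN.
        Company unnamed = Company.createNew(null, null, parent.id());
        System.out.println(unnamed.id() + " is linked to " + unnamed.parentId());
    }
}
```

If the id is included when a record is exported, the record keeps the same key on the receiving server. Two records created independently on different servers for the same real company still get different UUIDs and have to be matched during import.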
Thank you, Alexander, for the detailed answer. Yes, that is right: neither the completeness nor the reliability of the input data is guaranteed. Do I understand correctly that a unique id has to be generated for each "company" object? And if so, how does the other server recognize this object and its proper "place" in the subordination structure? For example: on server No. 1, user No. 1 entered the company OAO "Yukos" with the branch OOO "Yukos in the passenger seat". On server No. 2 (the central office), user No. 3 entered the company "Yukos" but does not know about its branches. Besides this company, other companies have also been entered on server No. 2. User No. 1 transmits the data on OAO Yukos and its branches to server No. 2. Question: how does the system (server No. 2) recognize that the incoming data is about OAO "Yukos" and update the data accordingly (synchronize, "align" the structure)? - Dean commented on July 9th 19 at 14:12
Alexander Kosarev, as I understand it, the generated ID of the same object will be different on different servers? And if so, how does the system synchronize the data? - Dean commented on July 9th 19 at 14:15
There are many sync options, but I'll suggest a couple.
Option 1. You need a message queue server (ActiveMQ, RabbitMQ, etc.). When a company is created, the server publishes a message with the parameters of the created company to some topic (nothing stops you from sending the whole object in an ObjectMessage). The other servers that are subscribed to this topic receive the message and create or update the company on their side.
Option 2. All servers implement a web service with a method for saving a company. The server on which the company was created then has to call this web service method on the other servers and send them the information about the company. But then you will need to come up with some kind of Service Discovery.
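The receiving side of Option 1 can be sketched in plain Java. This is only an illustration under stated assumptions: `Topic` is an in-memory stand-in for a broker topic and does not use the ActiveMQ or RabbitMQ client APIs, and `CompanyCreated` and `Server` are hypothetical names. It assumes the message carries the company's generated id, so that each subscriber can create or update its own copy by that id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.function.Consumer;

public class CompanyTopicDemo {

    // Hypothetical message payload carrying the parameters of a created company.
    record CompanyCreated(UUID companyId, String fullName, UUID parentId) {}

    // In-memory stand-in for a broker topic; a real broker delivers over the network.
    static final class Topic {
        private final List<Consumer<CompanyCreated>> subscribers = new ArrayList<>();

        void subscribe(Consumer<CompanyCreated> subscriber) {
            subscribers.add(subscriber);
        }

        void publish(CompanyCreated message) {
            for (Consumer<CompanyCreated> subscriber : subscribers) {
                subscriber.accept(message);
            }
        }
    }

    // Each server keeps its own store keyed by the shared company id.
    static final class Server {
        final Map<UUID, CompanyCreated> companies = new HashMap<>();

        // Upsert by id, so receiving the same message twice changes nothing.
        void onCompanyCreated(CompanyCreated message) {
            companies.put(message.companyId(), message);
        }
    }

    public static void main(String[] args) {
        Topic topic = new Topic();
        Server office1 = new Server();
        Server central = new Server();
        topic.subscribe(office1::onCompanyCreated);
        topic.subscribe(central::onCompanyCreated);

        CompanyCreated created = new CompanyCreated(UUID.randomUUID(), "OAO \"Yukos\"", null);
        topic.publish(created);
        topic.publish(created); // duplicate delivery is harmless

        System.out.println(central.companies.size()); // prints 1
    }
}
```

The same handler could sit behind the web service method from Option 2; only the transport differs.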

Actually, IMHO, the system's lack of centralization has a negative effect in this case. Centralization would help to solve the problem without architectural crutches. - Dean commented on July 9th 19 at 14:18
And yes, in this case it is more convenient to use a UUID as the company identifier. - sunny.Hagenes98 commented on July 9th 19 at 14:21
There is another limitation:
- automatic synchronization is not possible
- the user sends the data manually
This follows from the principle that data is passed on only to the extent that it concerns the recipient.

The question remains unclear: how will the other server understand that the received information is about that particular company, given that the ids are different?

Server No. 1:
OAO Yukos (with the subsidiary OOO "Bread")
ID: 0005

Server No. 2 (the user has created a record for the company he found):
OAO Yukos
ID: 0354

When data is sent from server No. 1 to server No. 2, how will the system understand that the object OAO Yukos with ID 0005 is the same as the object OAO Yukos with ID 0354? - Dean commented on July 9th 19 at 14:24
