Understanding distributed data

This topic applies to ArcGIS for Desktop Standard and ArcGIS for Desktop Advanced only.

Data distribution involves creating copies of data and distributing them between two or more geodatabases. It allows two or more offices to work on the same data in separate locations.

Data is distributed to improve availability and performance by reducing server contention and avoiding slow network access to a central server. This can help an organization balance the load on its geodatabases between users who edit the data and users who only read it.

Distributing data is also required for mobile users or contractors who need to take part of a geodatabase into the field for editing while disconnected from the network for an indefinite amount of time.

There are several ways to distribute your data across multiple geodatabases:

Copy and Paste

Some organizations have achieved a level of data distribution by saving copies of their geodatabases on CDs and DVDs and sending them to other offices. These offices can then work on the data, make edits, and send a copy of their updated geodatabase back to the main office. There, the edits are compared and reconciled so that the data at the two offices is in sync. This solution may work with careful communication, but there are many opportunities for updates to be lost, and it is difficult to keep the two geodatabases in sync.
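The risk of lost updates in the copy-and-paste approach can be illustrated with a small sketch. This is a toy model only: each geodatabase is represented as a plain Python dictionary, and the feature IDs, attribute names, and values are hypothetical, not taken from any ArcGIS API.

```python
import copy

# Toy model: a "geodatabase" is a dict mapping feature ID to attributes.
# All IDs and attribute values here are hypothetical.
baseline = {
    1: {"name": "Main St", "lanes": 2},
    2: {"name": "Oak Ave", "lanes": 2},
}

# The main office keeps working on its own copy.
main_office = copy.deepcopy(baseline)

# The branch office receives a full copy (for example, on a DVD) and edits feature 1.
branch_office = copy.deepcopy(baseline)
branch_office[1]["lanes"] = 4

# Meanwhile, the main office edits feature 2.
main_office[2]["name"] = "Oak Avenue"

# The branch copy is sent back and naively replaces the main office copy.
merged = copy.deepcopy(branch_office)

# Features the main office edited whose edits did not survive the replacement.
main_edits = {fid for fid in baseline if main_office[fid] != baseline[fid]}
lost_updates = {fid for fid in main_edits if merged[fid] != main_office[fid]}

print("Main office edits lost:", sorted(lost_updates))
```

Because nothing records which features each office changed, the only safe option in this model is to compare every feature by hand against the original copy.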

Geodatabase Replication

Geodatabase replication is a data distribution method provided through ArcGIS. With geodatabase replication, data is distributed across two or more geodatabases by replicating all or part of your dataset. When a dataset is replicated, a replica pair is created; one replica resides in the original geodatabase, and a related replica is distributed to a different geodatabase. Any changes made to these replicas in their respective geodatabases can be synchronized so that the data in one replica matches that in the related replica.

Geodatabase replication is built on top of the versioning environment and supports the full geodatabase data model, including topologies, networks, terrains, relationships, and so forth. In this asynchronous model, the replication is loosely coupled, meaning that each replicated geodatabase can work independently, and all changes can still be synchronized. Because replication is implemented at the geodatabase level, the DBMSs involved can be different. For example, one replica geodatabase could be built on top of SQL Server and the other on top of Oracle.
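The idea of a loosely coupled replica pair can be sketched in a few lines. This is a conceptual toy model under stated assumptions, not the ArcGIS implementation: the `Replica` class and the `create_replica_pair` and `synchronize` functions are hypothetical names, features are reduced to an ID and a text value, and conflicting edits to the same feature are not handled.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical stand-in for one replica; not an ArcGIS class."""
    name: str
    rows: dict = field(default_factory=dict)     # replicated features
    pending: dict = field(default_factory=dict)  # local edits not yet sent

    def edit(self, feature_id, value):
        # Edits are applied locally and remembered until the next synchronization.
        self.rows[feature_id] = value
        self.pending[feature_id] = value


def create_replica_pair(source_rows, selected_ids):
    """Replicate part of a dataset: both replicas start with the same subset."""
    subset = {fid: source_rows[fid] for fid in selected_ids}
    return Replica("parent", dict(subset)), Replica("child", dict(subset))


def synchronize(sender, receiver):
    """Send the sender's pending edits to the receiver (one direction only)."""
    sent = len(sender.pending)
    receiver.rows.update(sender.pending)
    sender.pending.clear()
    return sent


# Hypothetical source data; only features 1 and 2 are replicated.
source = {1: "hydrant A", 2: "hydrant B", 3: "hydrant C"}
parent, child = create_replica_pair(source, [1, 2])

# Each side works independently, with no connection required.
child.edit(2, "hydrant B (repaired)")
parent.edit(1, "hydrant A (moved)")

# Later, changes are exchanged in both directions.
synchronize(child, parent)
synchronize(parent, child)

print(parent.rows == child.rows)
```

Because each replica only records and exchanges its own changes, nothing in this model depends on how either side stores its rows, which mirrors why the underlying DBMSs can differ.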

Geodatabase replication can be used in connected and disconnected environments. It can also work with local geodatabase connections, as well as geodata server objects, which allow you to access a geodatabase on the Internet.

Learn more about preparing data for replication

DBMS Replication

DBMSs also have their own replication mechanisms in place, which can be used to make copies of and synchronize geodatabase data.

DBMS replication refers to the built-in replication mechanisms provided by the DBMS in which the geodatabase is stored. DBMS replication is not geodatabase aware. This means that the DBMS does not recognize geodatabase constructs such as relationship classes and geometric networks. However, DBMS replication can still be configured to work in a limited way with geodatabase data.

DBMS Replication versus Geodatabase Replication

The following facts compare geodatabase replication and DBMS replication:

7/30/2013