Connecting Business Systems
Most businesses run several systems, often five or more. A CRM for sales, an ERP for operations, an accounting package, a project management tool, maybe a legacy database that nobody wants to touch but everybody depends on. Each system holds a piece of the picture, and none of them share it willingly. The integration patterns you choose determine whether data flows reliably between those systems or quietly drifts out of sync.
System integration is the work of making those systems exchange data reliably, on time, and without losing records along the way. It sounds straightforward until you realise that each system has its own data model, its own timing expectations, and its own ideas about what a "customer" or an "order" actually is.
The patterns in this guide come from real integration work: connecting CRMs to ERPs, bridging legacy databases to modern APIs, and keeping data synchronisation running when one system goes down at 2am on a Saturday. If you are looking for the business-level view of connecting your tools, start with systems integration and data flow. This page covers the engineering patterns underneath.
What system integration actually involves
System integration is not just connecting two APIs. It is reconciling differences in data models, timing, reliability guarantees, and update velocities across every system in a business. Each connection point acts as an anti-corruption layer, preventing one system's assumptions from leaking into another.
Consider a straightforward scenario: a new customer signs up on your website. That customer record needs to reach your CRM, your accounting system, and your project management tool. Each system expects different fields, in different formats, at different times. Your CRM wants the record immediately. Your accounting system wants it when the first invoice is raised. Your project tool wants it when the first project is created.
The naive approach is to write a direct connection from your website to each system. It works for two or three systems. By the time you reach five, you are maintaining ten separate connections, each with its own error handling, its own retry logic, and its own way of failing silently. Most of this work belongs in background job queues, not in the HTTP request cycle. The integration layer itself needs its own architecture.
The real cost of poor integration is rarely the integration itself. It is the downstream effects: duplicate customer records in your CRM, invoices sent to the wrong address, stock levels that are three hours out of date. These failures erode trust in your data, which leads people to maintain spreadsheets alongside the systems that were supposed to replace spreadsheets.
The four core integration patterns
Every system integration project uses one of four fundamental integration patterns, or a combination of them. The right choice depends on how many systems you are connecting, how fresh the data needs to be, and how much operational complexity you are willing to manage.
Point-to-point
Direct system-to-system connections. System A calls System B's API, gets or sends data, and handles errors itself.
When it works: Two or three systems with clear, stable interfaces. A single CRM integration with your website, for example.
When it breaks: With N systems, point-to-point requires N*(N-1)/2 connections for full mesh connectivity. Five systems means ten connections. Ten systems means 45.
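The connection arithmetic is easy to verify with a one-line sketch:

```python
def connection_count(n: int) -> int:
    """Full-mesh point-to-point connections needed to link n systems."""
    return n * (n - 1) // 2

# 5 systems need 10 connections; 10 systems need 45.
print(connection_count(5), connection_count(10))
```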
Hub and spoke
A central integration hub handles all data transformation and routing. New systems connect only to the hub, not to every other system.
When it works: Five or more systems where data flows through a central process. ERP integration projects often land here because the ERP becomes the natural hub.
When it breaks: The hub is a single point of failure. If it goes down, every integration stops. You need monitoring, redundancy, and a team that understands the hub's configuration.
Event bus and message broker
Systems publish events ("order created", "customer updated") to a message broker. Other systems subscribe to the events they care about. Publishers do not know or care who is listening. This is the publish/subscribe pattern, and it is the dominant modern approach to event-driven architecture. Each event payload carries a self-describing message envelope with metadata (timestamp, source system, correlation ID) alongside the business data.
When it works: Many systems reacting to common business events. If adding a new system should not require changing existing systems, this is the right pattern. Topic design, consumer groups, and ordering guarantees matter here. A poorly designed topic structure creates the same coupling you were trying to eliminate.
When it breaks: Debugging is harder because the flow is indirect. Without monitoring dashboards and dead letter queue management, you will spend more time debugging than you saved in development. Tracing a failed event through three systems requires correlation IDs at every hop.
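The self-describing envelope described above can be sketched as follows. The field names here are illustrative, not a standard; real brokers and schema registries impose their own conventions:

```python
import json
import uuid
from datetime import datetime, timezone

def make_envelope(event_type: str, source: str, payload: dict) -> dict:
    """Wrap business data in a self-describing envelope (hypothetical field names)."""
    return {
        "event_type": event_type,                      # e.g. "order.created"
        "source_system": source,                       # who published this
        "correlation_id": str(uuid.uuid4()),           # follows the message everywhere
        "occurred_at": datetime.now(timezone.utc).isoformat(),
        "payload": payload,                            # the actual business data
    }

event = make_envelope("order.created", "webshop", {"order_id": "A-1001", "total": 49.99})
print(json.dumps(event, indent=2))
```

Consumers read the metadata to route, deduplicate, and trace; only the `payload` is business-specific.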
ETL (extract, transform, load)
Batch-oriented processing where data is extracted from source systems, transformed into the target format, and loaded into destination systems on a schedule.
When it works: Reporting, analytics, data warehousing, and any scenario where data does not need to be real-time.
When it breaks: When someone assumes it is real-time. If your stock levels update every four hours via ETL, your website will sell items that are already out of stock.
Before choosing a pattern, ask whether you should integrate at all. Not every data flow needs automated integration. If a process runs once a week and involves twenty records, a CSV export may genuinely be the right answer. The cost of building, testing, monitoring, and maintaining an integration must be lower than the cost of the manual process it replaces, including the cost of errors. If the break-even point is three years away and the business might change the systems involved before then, do not build it yet.
Common scenarios and recommended patterns
For UK SMBs running five to fifteen systems, these are the patterns we reach for most often.
| Scenario | Pattern | Timing |
|---|---|---|
| CRM to accounting (e.g. Salesforce to Xero) | Webhooks with queue | Near-real-time |
| Warehouse stock sync to website | Polling or CDC | Sub-5-minute |
| Legacy SQL Server ERP to Laravel | CDC or file-based export | Batch or near-real-time |
| Monthly financial consolidation | Batch ETL | Scheduled |
| Multi-system order pipeline | Event bus with workflow orchestration | Event-driven |
When CRM and ERP integration goes wrong
CRM integration and ERP integration are the two most common system integration projects we see. They are also the two most likely to fail, and the failures follow predictable patterns.
The duplicate record problem
Your sales team creates a contact in the CRM. Your accounts team creates the same person in the ERP. Now you have two records for the same entity, with slightly different data in each. The CRM says "J. Smith, Acme Ltd". The ERP says "John Smith, Acme Limited".
The robust pattern is a canonical identifier within a canonical data model: a single, system-generated ID that both systems use to refer to the same entity, with agreed field definitions that each integration maps to and from. One system is the authority for creating new entities. Other systems reference that authority's ID. This requires clear data modelling before any code is written, and an audit trail so you can trace how duplicate records were created when they inevitably slip through.
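A minimal sketch of the canonical-identifier idea, assuming an in-memory store and made-up system names; a real implementation would persist the mappings and record an audit trail:

```python
import uuid

class CanonicalRegistry:
    """One authority issues IDs; other systems map their local IDs onto them."""

    def __init__(self):
        self._canonical = {}   # canonical_id -> entity data
        self._aliases = {}     # (system, local_id) -> canonical_id

    def create(self, entity: dict) -> str:
        """Only the authority creates new entities and issues IDs."""
        canonical_id = str(uuid.uuid4())
        self._canonical[canonical_id] = entity
        return canonical_id

    def link(self, system: str, local_id: str, canonical_id: str) -> None:
        """Record that a system's local record refers to a canonical entity."""
        self._aliases[(system, local_id)] = canonical_id

    def resolve(self, system: str, local_id: str):
        """Look up the canonical ID for a system-local record, if linked."""
        return self._aliases.get((system, local_id))

registry = CanonicalRegistry()
cid = registry.create({"name": "John Smith", "company": "Acme Limited"})
registry.link("crm", "CRM-0042", cid)
registry.link("erp", "ERP-9913", cid)
# Both local records now resolve to the same canonical entity.
assert registry.resolve("crm", "CRM-0042") == registry.resolve("erp", "ERP-9913")
```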
The timing mismatch
Your CRM updates in real-time. Your ERP runs batch imports every 30 minutes. A salesperson closes a deal, the CRM fires a webhook, and the integration tries to create an order in the ERP. But the ERP is mid-batch and rejects the write.
The robust pattern here is a message queue with guaranteed delivery. The integration writes the message to a queue. The ERP consumer picks it up when it is ready. If the ERP is busy, the message waits. If it fails permanently, it lands in a dead letter queue for human review.
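The queue-plus-dead-letter flow can be sketched with an in-memory queue standing in for a real broker; the message shape and the retry limit are assumptions for illustration:

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed limit; tune to the downstream system's behaviour

def process_queue(queue: deque, handler, dead_letters: list) -> None:
    """Consume messages, retrying failures and dead-lettering permanent ones."""
    while queue:
        message = queue.popleft()
        attempts = message.get("attempts", 0)
        try:
            handler(message["body"])
        except Exception as exc:
            if attempts + 1 >= MAX_ATTEMPTS:
                message["error"] = str(exc)
                dead_letters.append(message)   # for human review, never silent loss
            else:
                message["attempts"] = attempts + 1
                queue.append(message)          # requeue for a later retry
```

A real broker adds delivery guarantees, visibility timeouts, and persistence, but the control flow is the same: retry, then dead-letter, never discard.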
The field mapping nightmare
Your CRM has a "Company Type" dropdown with five options. Your ERP has a "Customer Category" field with twelve. Neither maps cleanly to the other. Some CRM values map to two ERP categories depending on context. One ERP category has no CRM equivalent at all.
There is no shortcut here. Field mapping requires a spreadsheet, both teams in the room, and enough time to work through every edge case. The transformation logic that results is often the most complex part of the integration, frequently accounting for more effort than the connection layer itself, and the part most likely to need updating as either system evolves.
This is where data contracts become essential. A data contract is a versioned agreement between two systems about the shape, types, and constraints of the data they exchange. When System A adds a new enum value to "Company Type" or changes a field from a string to an array, the contract makes that change visible before it breaks the integration. Without contracts, you discover schema drift at 3am when the transformation logic throws an exception on a data shape it has never seen.
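A data contract can start as something as simple as a validation table checked at the integration boundary. A sketch, with hypothetical field names and allowed values:

```python
# Hypothetical contract for the "Company Type" mapping discussed above.
CONTRACT_V1 = {
    "company_type": {"type": str,
                     "allowed": {"Ltd", "PLC", "Sole Trader", "Partnership", "Charity"}},
    "company_name": {"type": str, "allowed": None},   # any string accepted
}

def violations(record: dict, contract: dict) -> list:
    """Return every contract violation, instead of throwing on the first one at 3am."""
    problems = []
    for field, rule in contract.items():
        if field not in record:
            problems.append(f"missing field: {field}")
            continue
        value = record[field]
        if not isinstance(value, rule["type"]):
            problems.append(f"{field}: expected {rule['type'].__name__}")
        elif rule["allowed"] is not None and value not in rule["allowed"]:
            problems.append(f"{field}: unexpected value {value!r}")
    return problems
```

Tools like Pact or a schema registry formalise this, but even a checked dictionary makes schema drift visible before it reaches the transformation logic.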
Data synchronisation strategies that hold up
Data synchronisation between systems requires clear answers to three questions: which direction does data flow, which system owns which fields, and what happens when two systems change the same record at the same time.
Sync direction
Decide explicitly whether each data flow is one-way or bidirectional. One-way flows are far simpler to build and debug; bidirectional synchronisation is only worth the complexity when both systems genuinely need to edit the same data, and it forces the ownership and conflict questions that follow.
The source-of-truth-by-field approach
Rather than declaring one system the overall authority, assign ownership at the field level. Your CRM owns contact details. Your ERP owns financial data. Your project tool owns delivery dates. When a field is updated in its owning system, the change propagates outward. When someone tries to update a field in a non-owning system, the integration either rejects the change or flags it for review.
This approach, closely related to the single source of truth principle, eliminates most synchronisation conflicts by making ownership explicit.
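A sketch of field-level ownership enforcement, with a made-up ownership map; real integrations would drive this from configuration rather than a hard-coded dictionary:

```python
# Hypothetical ownership map: field -> owning system.
FIELD_OWNER = {
    "email": "crm",
    "phone": "crm",
    "credit_limit": "erp",
    "delivery_date": "projects",
}

def apply_update(record: dict, field: str, value, source_system: str,
                 review_queue: list) -> bool:
    """Accept a change only from the owning system; flag the rest for review."""
    if FIELD_OWNER.get(field) == source_system:
        record[field] = value
        return True
    # Non-owning system tried to write: do not apply, but do not lose it either.
    review_queue.append({"field": field, "value": value, "from": source_system})
    return False
```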
Conflict resolution
| Strategy | How it works | Trade-off |
|---|---|---|
| Last write wins | The later update overwrites the earlier one | Simple but lossy |
| Source-of-truth wins | The owning system's value always takes precedence | Predictable and safe |
| Manual resolution | Flag conflicts for human review | Appropriate for high-value records |
For most UK mid-market businesses, source-of-truth-by-field with manual resolution for exceptions is the right balance between automation and safety.
Real-time vs batch vs near-real-time
The timing decision deserves more scrutiny than it usually gets. Most businesses asking for "real-time" integration actually need "fast enough", which typically means sub-5-minute latency. True real-time (event-driven, sub-second) costs significantly more to build and operate than near-real-time polling or micro-batching.
Inventory and order data genuinely need near-real-time sync. If your website sells an item that went out of stock three minutes ago, that is a real business cost. But reporting data, customer metadata updates, and financial reconciliation can tolerate hours of latency without consequence. Match the sync timing to the actual business impact of stale data, not to what feels correct. The infrastructure cost difference between event-driven and scheduled batch can be an order of magnitude, and that cost recurs every month.
Legacy system integration without the rewrite
Most businesses have at least one system that is ten or fifteen years old, runs on technology that is no longer mainstream, and is too critical to replace overnight. Legacy system integration is the work of connecting these systems to your modern stack without destabilising them.
File-based export and import
The legacy system writes a CSV or XML file to a shared location. A modern service picks it up, parses it, and routes the data. Unglamorous but reliable. Many legacy systems have been exporting flat files since before REST APIs existed.
Direct database connection
Connect to the legacy system's database and read or write data directly. This bypasses the legacy system's business logic. If the legacy system validates data on entry, your direct writes skip that validation. Use read-only connections where possible.
Wrapper service (strangler fig pattern)
Build a modern API that sits in front of the legacy system. New consumers talk to the wrapper. The wrapper translates requests into whatever the legacy system understands. Over time, you can migrate functionality from the legacy system to the wrapper without changing any consumers.
Screen scraping and UI automation
Automate the legacy system's user interface. This is a last resort. It is fragile, slow, and breaks whenever the UI changes. But sometimes it is the only option for systems with no API, no database access, and no file export capability.
The critical rule for legacy integration: Never change the legacy system's behaviour during integration. The legacy system works. People depend on it. Your integration must adapt to the legacy system, not the other way around.
Change data capture: the pattern between ETL and event-driven
Change data capture (CDC) monitors a database's transaction log and publishes an event for every insert, update, or delete. Tools like Debezium make this practical without modifying the source application. CDC sits in the gap between batch ETL (which waits for a schedule) and application-level event publishing (which requires code changes in the source system).
CDC is particularly relevant for legacy system integration. If the legacy system has a relational database but no API and no webhook support, CDC lets you capture every data change without touching the application code. You attach Debezium to the database's transaction log, and it streams changes to a Kafka topic or message queue. Your modern systems subscribe to those changes as if the legacy system were publishing events natively.
The trade-offs are real. CDC operates at the database level, so it captures raw row changes, not business events. A single business action (creating an order) might produce ten row changes across five tables. Your consumer needs to reconstruct the business meaning from those raw changes. For multi-step business operations that span several systems, a saga pattern can coordinate the individual steps, handling compensating transactions when one leg of the operation fails. CDC also requires access to the database's transaction log, which means database-level permissions and, for some databases, specific configuration changes. On actively migrating schemas, CDC consumers need to handle schema evolution gracefully, or they will break when a column is added or renamed.
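Reconstructing a business event from raw row changes might look like the following sketch; the table names and change shapes are illustrative, not what any particular CDC tool emits:

```python
def to_business_event(row_changes: list):
    """Fold the raw row changes for one transaction into a single business event."""
    # Find the insert on the (hypothetical) orders header table.
    header = next((c for c in row_changes
                   if c["table"] == "orders" and c["op"] == "insert"), None)
    if header is None:
        return None   # this transaction was not an order creation; ignore here
    # Gather the related line-item rows into the event payload.
    lines = [c["row"] for c in row_changes if c["table"] == "order_lines"]
    return {"event_type": "order.created",
            "order_id": header["row"]["id"],
            "lines": lines}
```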
API integration patterns for reliable data flow
When systems do expose APIs, the quality of your integration depends on how you handle the things that go wrong. Networks fail, services restart, rate limits trigger, and data arrives in unexpected formats. These integration patterns separate "works in testing" from "works in production".
Idempotent operations
An idempotent operation produces the same result whether you call it once or ten times. In a system that provides at-least-once delivery (which all practical message brokers do), designing every write operation to be idempotent is non-negotiable. Use unique request identifiers so the receiving system can recognise and deduplicate retries.
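A minimal in-memory sketch of idempotent handling keyed on a request identifier; a real system would persist the seen-key set in the same transaction as the write:

```python
processed_keys = set()

def handle_payment(idempotency_key: str, amount: int, ledger: list) -> bool:
    """Apply the write once; repeated deliveries of the same key are no-ops."""
    if idempotency_key in processed_keys:
        return False                # duplicate delivery, safely ignored
    processed_keys.add(idempotency_key)
    ledger.append(amount)           # the actual side effect happens exactly once
    return True
```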
Circuit breakers and retry with backoff
When a downstream system is failing, continuing to send requests makes things worse. A circuit breaker tracks failure rates and stops sending requests for a cooldown period. This prevents cascade failures where one system's outage takes down every connected system. Pair circuit breakers with exponential backoff on retries: wait 1 second, then 2, then 4, then 8. Without backoff, your retry logic becomes a denial-of-service attack against a system that is already struggling.
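A compact sketch of both ideas, with illustrative threshold and cooldown values; production libraries add half-open probing, jitter, and per-endpoint state:

```python
import time

class CircuitBreaker:
    """Open the circuit after `threshold` consecutive failures; retry after cooldown."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            self.opened_at = None    # cooldown elapsed: let requests probe again
            self.failures = 0
            return True
        return False                 # circuit open: fail fast, spare the downstream

    def record(self, success: bool) -> None:
        if success:
            self.failures = 0
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()

def backoff_delays(base: float = 1.0, attempts: int = 4) -> list:
    """Exponential backoff schedule: 1s, 2s, 4s, 8s."""
    return [base * 2 ** i for i in range(attempts)]
```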
Dead letter queues
When a message fails processing after all retry attempts, it goes to a dead letter queue rather than being discarded. Dead letter queues are the safety net that means a transient failure at 3am does not result in permanently lost data. Monitor your dead letter queues: an empty queue is healthy, a growing queue means something is wrong upstream.
Rate limiting and backpressure
Most APIs enforce rate limits. Your integration needs to respect them gracefully, not hammer the API until it blocks you. Implement rate limiting on your side, queue requests that exceed the limit, and process them at a sustainable pace.
Integration anti-patterns to watch for:
- Writing to two systems in the same HTTP request without a queue between them (dual writes)
- No idempotency key on write operations, so retries create duplicate records
- No field ownership declaration, so two systems overwrite each other's changes
- No dead letter queue, so failed messages vanish silently
- Polling an API every 10 seconds for changes when a webhook or CDC approach would eliminate the load
- Validating HTTP status codes but not business outcomes (a 200 response does not mean the data was processed correctly)
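The last anti-pattern deserves a concrete illustration. The response shape here is hypothetical; the point is that success is a property of the body, not just the status code:

```python
def write_succeeded(status_code: int, body: dict) -> bool:
    """A 200 is necessary but not sufficient: inspect the business outcome too."""
    return status_code == 200 and body.get("status") == "processed"

# A 200 with a rejection in the body is still a failure.
assert write_succeeded(200, {"status": "processed"}) is True
assert write_succeeded(200, {"status": "rejected", "reason": "duplicate SKU"}) is False
```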
Choosing the right integration middleware
If you are connecting more than three systems, you probably need middleware. The choice depends on your throughput requirements, your team's skills, and how much operational overhead you are willing to accept.
Ask these questions when deciding: How many messages per second? Do you need event replay? Who will operate it? What is your budget model? At moderate throughput, any option works. At very high volumes, Kafka or a cloud-native service becomes necessary. If you do not have a DevOps team, cloud-native services remove that burden.
Testing and debugging integrations in production
Integration code that works in a test environment and fails in production is the norm, not the exception. The differences between environments (network latency, rate limits, data volumes, concurrent users) mean that unit tests and staging deployments only cover part of the risk. Integration testing is its own discipline.
Contract testing
Consumer-driven contract tests (using tools like Pact) verify that two systems agree on the shape of data they exchange. The consumer defines what it expects. The provider verifies it can deliver that shape. If either side changes its schema, the contract test fails before the change reaches production. This is the most reliable way to catch schema drift early, and it works even when the other system is a third-party API you do not control.
Correlation IDs and distributed tracing
Every message entering your integration layer should carry a correlation ID: a unique identifier that follows it through every queue, consumer, API call, and database write. When a message fails three systems downstream, the correlation ID lets you trace its entire path without guessing. Attach the correlation ID to log entries, dead letter queue messages, and audit trail records. Without it, debugging a multi-system failure means manually correlating timestamps across separate log streams.
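A sketch of propagating a correlation ID, assuming a simple dict-based message shape; tracing frameworks do the same thing with headers and context objects:

```python
import logging
import uuid

logger = logging.getLogger("integration")

def with_correlation(message: dict) -> dict:
    """Attach a correlation ID at the entry point, or preserve an existing one."""
    message.setdefault("correlation_id", str(uuid.uuid4()))
    return message

def log_step(message: dict, step: str) -> None:
    # Every log line carries the same ID, so one search reconstructs the path.
    logger.info("step=%s correlation_id=%s", step, message["correlation_id"])

msg = with_correlation({"body": {"order_id": "A-1001"}})
log_step(msg, "queued")
log_step(msg, "transformed")
```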
Observability
Every integration needs a dashboard showing four metrics: throughput (messages per minute), latency (time from publish to consume), error rate (percentage of failed messages), and queue depth (how far behind consumers are falling). A growing queue depth means consumers cannot keep up. A rising error rate means something upstream has changed. An empty dead letter queue is healthy. A growing one means failures are accumulating without human attention. Connect these dashboards to your infrastructure monitoring so integration health is visible alongside application health.
What a good integration project process looks like
A consistent approach to system integration works because the problems are consistent, regardless of the specific systems involved. Every field mapping, every transformation rule, and every error handling decision is documented, so your team can maintain and extend the integration without depending on us indefinitely.
Connect your systems
If your business is running on disconnected systems and the manual workarounds are slowing you down, we should talk. Our integration service covers the full picture: mapping your data flows, choosing the right patterns, building the connections, and maintaining them as your systems evolve. We will map out the right integration patterns for your specific setup and show you what a connected system looks like.
Talk through your integration needs →