Syncing Collections Using Delta Files
Collections help organize products for commerce channels and shoppers by using category source inclusions/exclusions and attribute-based rules.
Product Catalog maintains collection assignments through two background jobs, which evaluate and update product assignments whenever attributes or rules change:
| Job Name | Function | Events |
| --- | --- | --- |
| Product Update: Collection Evaluation | Monitors product attribute updates and re-evaluates product eligibility for collections. | `pim:jobs.ProductUpdate.CollectionEvaluation:completed`<br>`pim:jobs.ProductUpdate.CollectionEvaluation:failed` |
| Collection Update: Product Evaluation | Monitors collection rule updates and re-evaluates products based on the revised collection rules. | `pim:jobs.CollectionUpdate.ProductEvaluation:completed`<br>`pim:jobs.CollectionUpdate.ProductEvaluation:failed` |
If you are subscribed to Product Catalog’s Collections webhook events, the system synchronizes product updates using delta files. The following diagram shows the sequence of tasks performed during the synchronization process:
The system runs background jobs that generate product update files. Depending on the job status, an event is triggered.
Synchronization Process
1. Job execution
The system runs one of two background jobs:
- Product Update: Collection Evaluation (re-evaluates product eligibility for collections based on attribute updates).
- Collection Update: Product Evaluation (re-evaluates products based on collection rule changes).
2. Event triggering
Once a job completes successfully or fails, it triggers an event. The webhook listener must be subscribed to both completed and failed events because failed jobs may still update some products.
4. Event delivery
The fabric event publisher delivers the event to the webhook listener.
4. Extract the output file ID
The webhook listener retrieves the first output file ID from the event payload. If multiple output files exist:
- Only the first file should be processed.
- Additional files should be reviewed to determine if they contain duplicates, sequential data, or logs.
Example payload with output files
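The exact payload schema isn't reproduced here; the following is an illustrative sketch (field names and values are assumptions, so consult the fabric webhook reference for the actual structure):

```json
{
  "event": "pim:jobs.ProductUpdate.CollectionEvaluation:completed",
  "jobId": "job-123-example",
  "output": {
    "files": [
      { "fileId": "file-001", "name": "updated-products.zip" },
      { "fileId": "file-002", "name": "job-log.zip" }
    ]
  }
}
```

In a payload like this, only `file-001` (the first entry) would be processed; the remaining entries would be reviewed per the guidance above.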
5. Retrieve the file download link
Make an API request to fetch the file download link using the extracted file ID.
Example cURL request
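A request of this shape might be used; the host, route, and headers below are assumptions, so substitute the File API endpoint and auth scheme from your fabric environment:

```shell
# Hypothetical endpoint -- replace with your environment's File API route,
# and substitute the file ID extracted from the event payload.
curl --request GET \
  --url "https://api.example-fabric-host.com/v3/files/<fileId>/download-url" \
  --header "Authorization: Bearer $ACCESS_TOKEN"
```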
6. Download and extract the file
Download the output file from the retrieved link. The downloaded file is a .zip archive that must be extracted before use.
Retry logic:
- If the download fails, retry with exponential backoff.
- Verify file integrity before extraction.
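The retry-and-verify flow above can be sketched as follows. This is a minimal illustration, not fabric's implementation: the `fetch` callable stands in for the HTTP GET of the signed URL, and all names are assumptions.

```python
import time
import zipfile
from io import BytesIO

def download_with_retry(fetch, max_attempts=5, base_delay=1.0):
    """Call fetch() until it succeeds, sleeping 1s, 2s, 4s, ... between
    attempts (exponential backoff). Re-raises after the final attempt."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

def extract_delta_file(zip_bytes, dest_dir):
    """Verify archive integrity (CRC check), then extract the delta file."""
    with zipfile.ZipFile(BytesIO(zip_bytes)) as archive:
        if archive.testzip() is not None:  # None means every member's CRC passed
            raise ValueError("corrupt archive")
        archive.extractall(dest_dir)
        return archive.namelist()
```

In practice, `fetch` would wrap the HTTP client call that downloads the file from the signed URL, raising on any non-success status so that the backoff loop engages.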
Example signed URL for file download
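A signed URL of this kind has roughly the following shape (the host, path, and parameter values here are placeholders for illustration only):

```
https://<bucket>.s3.amazonaws.com/<path>/output.zip?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=3600&X-Amz-Signature=<signature>
```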
This signed URL is a temporary, pre-authenticated link that allows you to download the generated output file. The URL includes security parameters such as an expiration time (`X-Amz-Expires`), the signing algorithm (`X-Amz-Algorithm`), and a signature (`X-Amz-Signature`).
Ensure that the webhook listener processes the file promptly after retrieval, so the signed URL does not expire and force additional API requests.
If the URL expires before the file is downloaded, request a new signed URL by calling the File API again.
7. Process the extracted data
If no products were updated, the file contains the message “No products updated.”
If products were updated, the file contains JSON entries listing:
- Product ID
- SKU
- Type (item, variant, or bundle)
- Status (live, draft)
- Collections the product was added to or removed from
Category attributes are inherited by default, meaning sub-collection nodes automatically adopt attribute values from their parent categories. However, users can override these values as needed, allowing for customization at different hierarchy levels.
Example JSON entries
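The entries take roughly the following form; the field names and values below are illustrative assumptions rather than fabric's exact schema:

```json
[
  {
    "productId": "12345",
    "sku": "SKU-001",
    "type": "item",
    "status": "live",
    "addedToCollections": ["summer-sale"],
    "removedFromCollections": ["clearance"]
  }
]
```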
8. Synchronize the updated data
Process the extracted product data for further synchronization within the system to ensure that any updates correctly propagate through downstream services.
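The propagation step above can be sketched as a small dispatcher. This is an illustrative pattern only: the handler callables stand in for whatever downstream sync calls your system makes, and the field names assume the illustrative entry schema rather than fabric's exact output format.

```python
import json

def apply_delta(delta_text, add_handler, remove_handler):
    """Route each collection change in the extracted file to a downstream
    handler. Returns the number of changes propagated."""
    # Per step 7, a no-op run produces a plain message instead of JSON.
    if delta_text.strip() == "No products updated.":
        return 0
    changes = 0
    for entry in json.loads(delta_text):
        for collection in entry.get("addedToCollections", []):
            add_handler(entry["productId"], collection)
            changes += 1
        for collection in entry.get("removedFromCollections", []):
            remove_handler(entry["productId"], collection)
            changes += 1
    return changes
```

Separating parsing from the handlers keeps the downstream calls (search index, storefront cache, and so on) independently retryable.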