FIM2010–Preloading Data to Avoid Long Synchronizations

It happens all the time: we have a set of data that is being synchronized, and then a change occurs. The change is fairly significant in that it adds another attribute to (or removes one from) existing objects, and that attribute needs to be synchronized to one or more other systems.

The synchronization service is already chugging along happily and efficiently on the current rule set. The addition of the new attribute, however, will cause a couple of things to occur. The first is the requirement for a full import and full synchronization on the MA where the data is coming from, and the second is the export to the target systems of ALL the objects that have data in the new attribute.

When dealing with the FIM MA, this can be a significant delay, and I strongly recommend that, if possible, you pre-populate the data using an out-of-band process. With the data already present in the target attributes of the target system, you can greatly reduce the time it takes to move the change through the synchronization engine.

Think about it this way. Assume that we’ve done all the work to update the MAs and the attribute flows as required, and that our dataset is about 100,000 entries. In the sync-engine-only approach, the following steps would have to occur.

  • Step 1: Full import on the source MA (100,000 entries)
  • Step 2: Full Synchronization on the Source MA to move the data from the connector space through to the metaverse. (100,000 entries)
  • Step 3: Export to Target MAs (we’ll say that we only have two MAs, one source and one target, which means that only 100,000 exports have to occur).
  • Step 4: Import of Target MA. Regardless of whether we do a delta or full import here, we modified 100,000 entries; therefore, we’ll have 100,000 imports.

If we pre-stage the data into the target system, we cut out a fairly large portion of the export work, which is by far one of the higher-cost operations from a time perspective. To prepare the environment, the attributes have to be present in the target systems. Note that in some cases you may have to do a schema refresh, or you may end up with app-store-exceptions (more about that in a later post). But preloading the target system removes an entire step, the export, thereby saving us the processing of all those entries.
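
As a rough illustration of the pre-staging idea, the sketch below works out which target entries would need the new attribute written by an out-of-band process. It is only a sketch under stated assumptions: the function name `plan_preload`, the anchor keys, and the sample values are all hypothetical, and the actual write to the target system depends entirely on that system and is not shown.

```python
from typing import Dict, Optional


def plan_preload(
    source_values: Dict[str, str],
    target_values: Dict[str, Optional[str]],
) -> Dict[str, str]:
    """Return anchor -> value for every target entry that needs an update.

    Hypothetical helper: both dictionaries are keyed by whatever anchor
    joins a source object to its target object. Entries that do not yet
    exist in the target are skipped, since creating them is provisioning
    work rather than a preload.
    """
    return {
        anchor: value
        for anchor, value in source_values.items()
        if anchor in target_values and target_values[anchor] != value
    }


# Made-up sample data: e1 is missing the value, e2 already matches,
# and e3 does not exist in the target at all.
source = {"e1": "Sales", "e2": "IT", "e3": "HR"}
target = {"e1": None, "e2": "IT"}

updates = plan_preload(source, target)  # only e1 needs writing
```

Only the entries returned by the plan need to be written, so re-running the preload after a partial failure does not rewrite values that are already in place.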

  • Step 1: Full import on the source MA (100,000 entries)
  • Step 2: Full Synchronization on the Source MA to move the data from the connector space through to the metaverse. (100,000 entries)
  • Step 3: Import of Target MA. Regardless of whether we do a delta or full import here, we modified 100,000 entries; therefore, we’ll have 100,000 imports.
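
To put numbers on the two step lists, here is a small sketch that simply adds up the object operations in each approach, using the 100,000-entry example from this post. It treats every operation as equal in cost, which is an assumption made purely for illustration; since exports tend to be among the slower operations, the real time saved is likely larger than the raw count suggests.

```python
ENTRIES = 100_000  # dataset size used in the example above

# (step description, objects processed) for the sync-engine-only approach
SYNC_ENGINE_ONLY = [
    ("Full import on the source MA", ENTRIES),
    ("Full synchronization on the source MA", ENTRIES),
    ("Export to the target MA", ENTRIES),
    ("Import of the target MA", ENTRIES),
]

# Preloading the target system removes the export step entirely.
PRELOADED = [step for step in SYNC_ENGINE_ONLY if not step[0].startswith("Export")]


def total_operations(steps):
    """Sum the number of objects processed across all steps."""
    return sum(count for _, count in steps)


saved = total_operations(SYNC_ENGINE_ONLY) - total_operations(PRELOADED)

if __name__ == "__main__":
    print("sync engine only:", total_operations(SYNC_ENGINE_ONLY))
    print("preloaded:", total_operations(PRELOADED))
    print("operations saved:", saved)
```

With these figures the sync-engine-only approach processes 400,000 object operations and the preloaded approach 300,000, so one operation in four is avoided.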

Anyway, I just wanted to remind everyone that a major change in the dataflows does not have to be the equivalent of an initial load. With proper forethought and planning, the time during which normal synchronization operations are affected can be minimized considerably.


7 Responses to FIM2010–Preloading Data to Avoid Long Synchronizations

  1. Chris Clayton says:

    Great blog! I always see something in your posts to get me thinking.

    In my experience it is often the export that takes the least amount of time out of those steps, depending on the MA (FIM MA being the slowest of the slow, though, and it would certainly justify your suggestion).

    If you can do it, it is nice to have two different sync engines running. New rules can be staged and sync’ed on the backup server while the primary runs the existing rules and provisioning, and then you switch. Of course there is expense and effort in maintaining two, but for those with high expectations of rapid disaster recovery (think password synchronization) it is a good scenario. It is best not to have two sync engines exporting to the same data source (i.e. AD) at the same time, but if the only thing the backup server queues up are your newly-managed attributes, it should work.

    I have yet to figure out how to keep two independently operating FIM Portals in sync with one another. I only have this running with ILM 2007. Same rules and same source data mean the same outcome (mostly) in two different metaverses.

    • Hi Chris,

      Thanks for the kudos on the blog. I appreciate the feedback. Makes the time writing the posts worthwhile.

      I agree the secondary synchronization engine can certainly be a solution; however, I’m not fond of having two FIM MAs pointed at the same FIM instance. Regarding your comment on the slowness of exports, the FIM MA is certainly the target of this discussion: in the new paradigm, the service is more central to system operation than the synchronization engine, since data manipulation occurs there via workflow and user interaction.

      Have a good day!


      • Chris Clayton says:

        Definitely, two FIM MAs connecting to the same service instance doesn’t sound good at all, and maintaining two independently-operating FIM Service instances with the same configuration and data (what I intended) is troublesome in ways the sync engine never was. When I get that far in FIM deployment, my DR plan will have to radically alter.

      • Hi Chris,

        The independently operating FIM Services idea seems interesting. Why independently operating and not a shared common database? That is how I’ve deployed in a few cases and it works fine. That way a configuration change made at one site automatically applies to both (once the appropriate caching timeouts expire or the service is manually restarted).



  2. Chris Clayton says:

    Hey Blain, I didn’t mean to hijack your post with this thread, but I’m happy to wade deeper since you asked…

    The main reason for my second ILM server is disaster recovery. Only recently have I really leaned on it as a way to roll big changes into production. Seeing how well that worked makes me want to maintain that ability in the future, especially given a 24/7 1-hour turnaround provisioning SLA with ~150k objects growing at 30-50k/year.

    If two FIM Service boxes are relying on the same database, it is a single point of failure…no quick way to switch over to a secondary site with an operational portal. Perhaps having a log-shipped backup service database ready to go would allow the DR FIM installation to come online relatively quickly, but it wouldn’t allow the staging of design changes in parallel with production and then a rapid switchover (with an easy back-out plan of switching back) because the two FIM installations are actually linked together.

    When I get to the point of implementing FIM, the FIM MA and creating a DR environment for it, I have to account for the fact that the FIM MA is not authoritative for new account data (comes from our ERP system) but it must be the first to synchronize a new identity (if it has somehow come into the FIM Service database from the other metaverse) since one projected into the second metaverse by any other MA would have a different metaverse ID, the pre-existing FIM MA connector space object couldn’t join to it, and I’d end up with duplicates. Perhaps the ERP-linked MA could be set not to project until the DR sync engine became the active/production one and the FIM MA had been updated, imported and sync’ed.

    Like I said, your posts get me thinking! -Chris

    • Chris,

      You may have already posted something that will answer my question. If so, please point me in the right direction. I saw your post here, Define FIM/ILM Run Profile Strategies (can’t find part 2 or 3), but I’d like to know the “proper” strategy when 3 MAs are involved: HR, FIM, AD.

      • Hi There,

        The key thing about run scripts is that you need to see what actions need to occur in what order. This takes a bit of work in simply understanding the processes that have been implemented on the synchronization engine. For example, take the 3 MAs you provided (HR, FIM and AD):
        1. The HR MA provides all the information for new employees.
        2. The FIM MA uses workflows to set certain other values and assign a synchronization rule (although with filtered outbound sync rules this wouldn’t apply).
        3. The AD MA is simply a provisioned system.

        The assumption is that the data that already exists in the system is joined together and we’re running in a steady state. In this case, we would want to do the following:

        1. HR MA import and synchronization.
        2. FIM MA export to push the data to FIM for processing.
        3. FIM MA import and synchronization to bring the changes back to the synchronization engine.
        4. AD MA export to provision the new entries or update the existing joined entries.
        5. AD MA import to close the export loop. (I generally run a sync here as well)
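
The ordering above can be sketched as a simple data-driven sequence. This is only an illustration under stated assumptions: the MA names, the run profile names, and the `execute` callable are hypothetical placeholders for however run profiles are actually triggered in a given environment, and treating any status other than "success" as a failure is a simplification.

```python
from typing import Callable, List, Tuple

# (management agent, run profile) pairs in the order described above.
# All names are hypothetical; use the ones defined in your environment.
STEADY_STATE_SEQUENCE: List[Tuple[str, str]] = [
    ("HR MA", "Import"),
    ("HR MA", "Synchronization"),
    ("FIM MA", "Export"),
    ("FIM MA", "Import"),
    ("FIM MA", "Synchronization"),
    ("AD MA", "Export"),
    ("AD MA", "Import"),           # closes the export loop
    ("AD MA", "Synchronization"),  # optional follow-up sync
]


def run_sequence(
    sequence: List[Tuple[str, str]],
    execute: Callable[[str, str], str],
) -> List[Tuple[str, str]]:
    """Run each step in order and stop at the first one that fails.

    `execute` is a caller-supplied function that runs one profile on one
    MA and returns a status string (assumed here to be "success" when
    the run completed cleanly).
    """
    completed = []
    for ma_name, profile in sequence:
        status = execute(ma_name, profile)
        if status != "success":
            raise RuntimeError(f"{ma_name} / {profile} returned {status!r}")
        completed.append((ma_name, profile))
    return completed


def dry_run(ma_name: str, profile: str) -> str:
    """Stand-in executor that only reports what would be run."""
    print(f"would run {profile} on {ma_name}")
    return "success"


if __name__ == "__main__":
    run_sequence(STEADY_STATE_SEQUENCE, dry_run)
```

Keeping the sequence as data rather than hard-coded calls makes it easy to reorder the steps when a different MA becomes the source of the data.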

        But another scenario may have the HR MA and AD MA both providing data to FIM, in which case the sequence would change again.

        Hope this helps.

