TechOnTip Weblog

Run book for Technocrats

Archive for July, 2014

Enable External GalSync Contacts for Lync Address Book

Posted by Brajesh Panda on July 28, 2014

I found this article at http://uccexperts.com/enabling-ad-mail-contacts-for-lync/ and used the same procedure for my MIIS-based GalSync solution. It works perfectly. I made just one correction to the original article and added a couple of lines here and there. Credit for the solution goes to the original author. Cheers!!

Situation

I was working in an environment with multiple Exchange 2010 forests where Forefront Identity Manager (FIM) was used to build a common global address list (GAL). Each forest also has its own Lync 2010 implementation without Enterprise Voice; in other words, there are two separate Lync environments with two different SIP domains.

By default the Lync address book is automatically populated with all objects that have one of the following attributes filled in:

msRTCSIP-PrimaryUserAddress

telephoneNumber

homePhone

mobile

If the msRTCSIP-PrimaryUserAddress attribute is missing, Lync is not able to show presence information for the contact and may show a phone icon instead of a person icon/picture.

By default, FIM GalSync synchronizes all of those attributes except msRTCSIP-PrimaryUserAddress. This caused contacts in the remote forest to appear in the address book with a telephone icon:


This situation caused confusion for our users, because they expect the Lync client to work for instant messaging with Lync users in the remote forests. When they try to start an IM session with a remote-forest user, Outlook starts and creates a new e-mail message instead.
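Before changing anything in FIM, it can help to see which synchronized contacts are affected. Below is a minimal sketch using Python with the ldap3 library; the library choice, domain controller name, GalSync OU and service account are my own placeholders rather than anything from the original setup:

    # List GalSync contacts that have a phone attribute but no msRTCSIP-PrimaryUserAddress.
    # Hypothetical server, OU and account names - adjust for your own forest.
    from ldap3 import Server, Connection, NTLM, ALL

    server = Server("dc01.company.nl", get_info=ALL)
    conn = Connection(server, user="COMPANY\\svc-galsync", password="***",
                      authentication=NTLM, auto_bind=True)

    conn.search(
        search_base="OU=GalSync,DC=company,DC=nl",
        search_filter="(&(objectClass=contact)"
                      "(|(telephoneNumber=*)(homePhone=*)(mobile=*))"
                      "(!(msRTCSIP-PrimaryUserAddress=*)))",
        attributes=["displayName", "mail"],
    )

    for entry in conn.entries:
        # These contacts show up with a phone icon in the Lync address book
        print(entry.displayName, entry.mail)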

Note: If you see a phone icon for those users, make sure to test federation using their SIP address directly rather than the default AD objects. You can add a Lync contact to the Outlook address book, stamp the SIP address manually, and test federation that way.

You can also export and then manually add/update this attribute. That works too, but it remains a manual process for every future update. Using the procedure below you can configure the GalSync management agents to replicate this Lync attribute automatically.
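If you just want the one-off manual fix (or want to test before touching the management agents), the attribute can be stamped directly on a contact. A hedged sketch, reusing the ldap3 connection from the previous example; the contact DN and SIP address are illustrative only:

    # Manually stamp the SIP address on a single GalSync contact (illustrative values).
    from ldap3 import MODIFY_REPLACE

    contact_dn = "CN=corporate01,OU=GalSync,DC=company,DC=nl"   # hypothetical DN
    sip_address = "sip:corporate01@corporate.nl"                # must match the user's real SIP URI

    conn.modify(contact_dn,
                {"msRTCSIP-PrimaryUserAddress": [(MODIFY_REPLACE, [sip_address])]})
    print(conn.result["description"])   # 'success' if the write was accepted

Keep in mind that manually stamped values are not maintained by GalSync; the management agent changes below make the attribute flow permanent.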

Solution

The solution is to include the AD attribute “msRTCSIP-PrimaryUserAddress” in the FIM address list synchronization.

Lab Setup

The overview below depicts my lab setup:


The lab runs Exchange 2010, Lync 2010 and FIM 2010 in a Windows 2008 R2 Active Directory. My environment uses MIIS GalSync.

Scope

The scope of this procedure is to add the “msRTCSIP-PrimaryUserAddress” value from the local forest to the corresponding contact in the remote forest by using the built-in GalSync management agents of FIM 2010. This procedure does not cover the implementation of GalSync itself.

Presence and instant messaging to the remote forest will only be available when you have Lync Edge servers and federation in place. This procedure focuses on changing the AD attributes so that Lync recognizes the contact as a Lync-enabled contact.

PROCEDURE

Step 1: Extend the metaverse schema

  1. Start the Synchronization Service Manager and click Metaverse Designer.
  2. Select person in the Object types pane
  3. Click Add Attribute in the Actions pane

  4. Click New Attribute in the “Add Attribute to object type” window

  5. Enter the following information in the “New Attribute” window:

Attribute name: msRTCSIP-PrimaryUserAddress
Attribute type: String (indexable)
Mapping Type: Direct
Multi-valued: Clear check box
Indexed: Clear check box


  6. Click OK
  7. Click OK

Step 2: Configure Management Agent of corporate.nl

  • Start the FIM Synchronization Service Manager Console and select “Management Agents”
  • Right click the Management Agent you want to modify and select Properties.
  • Go to the “Select Attributes” section
  • Check the Show All box and select the attribute “msRTCSIP-PrimaryUserAddress”, click OK


  • Return to the properties of the Management Agent and select the section “Configure Attribute Flow”
  • Configure this section according to the following table:

Data source object type: user
Metaverse object type: person
Mapping Type: Direct
Flow Direction: Import
Data source attribute: msRTCSIP-PrimaryUserAddress
Metaverse attribute: msRTCSIP-PrimaryUserAddress


  • Click New
  • Verify this modification by expanding the following header:

  • Check if the following rule is added:

Step 3: Import modification to the metaverse

  • Right click the management agent you just modified and select Run
  • Run a Full Import and Full Synchronization

Step 4: Verify attribute import

  • Start the FIM Synchronization Service Manager Console and select “Metaverse Search”
  • Click “Add clause”
  • Enter a clause that matches the test user, for example displayName equals corporate01:

  • Click “Search”
  • In the “Search Results” pane, right click the user with displayname corporate01 and select Properties
  • Confirm that the attribute “msRTCSIP-PrimaryUserAddress” contains a value

  • Click Close

Step 5: Configure Management Agent of company.nl

  • Start the FIM Synchronization Service Manager Console and select “Management Agents”
  • Right click the Management Agent you want to modify and select Properties.
  • Go to the “Select Attributes” section
  • Check the Show All box and select the attribute “msRTCSIP-PrimaryUserAddress”, click OK

  • Return to the properties of the Management Agent and select the section “Configure Attribute Flow”
  • Configure this section according to the following table:
Data source object type: contact
Metaverse object type: person
Mapping Type: Direct
Flow Direction: Export (allow nulls)
Data source attribute: msRTCSIP-PrimaryUserAddress
Metaverse attribute: msRTCSIP-PrimaryUserAddress


  • Click New
  • Verify this modification by expanding the following header:

  • Check if the following rule is added:

Step 6: Export modification to the remote forest

  • Right click the management agent you just modified and select Run
  • Run a Full Import and Full Synchronization
  • Right click the management agent again and select Run
  • Run an Export

Step 7: Verify attributes in remote forest

  • Start Active Directory Users And Computers and enable the Advanced features
  • Go to the OU where the FIM Galsync creates the contacts
  • Double click the contact “corporate01” and go to the Attribute Editor tab

  • Confirm that the attribute “msRTCSIP-PrimaryUserAddress” contains a value (a scripted check is sketched below)
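The same verification can also be scripted against the remote forest instead of using the Attribute Editor. A small sketch with Python/ldap3, again with hypothetical server, OU and credentials:

    # Confirm the synchronized contact received msRTCSIP-PrimaryUserAddress.
    from ldap3 import Server, Connection, NTLM

    conn = Connection(Server("dc01.company.nl"), user="COMPANY\\svc-galsync",
                      password="***", authentication=NTLM, auto_bind=True)

    conn.search("OU=GalSync,DC=company,DC=nl",
                "(&(objectClass=contact)(displayName=corporate01))",
                attributes=["msRTCSIP-PrimaryUserAddress"])

    entry = conn.entries[0]
    print(entry["msRTCSIP-PrimaryUserAddress"])   # should print the user's SIP URI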

What does it look like in the Lync client?

If I log in as user company01, we can see the following result in the Lync client:

In the screenshot above the users in the remote forest have a status of “Presence Unknown”. This is because I did not have Edge servers implemented in my test environment.

If you have implemented Lync Edge servers and you have your Lync federations between both organizations in place, the presence will be shown for the contacts as if they were users in the local Lync organization.

Posted in Mix & Match | 17 Comments »

SSD Caching versus Tiering

Posted by Brajesh Panda on July 10, 2014

BY TEKINERD, ON NOVEMBER 8TH, 2010

http://tekinerd.com/2010/11/ssd-caching-versus-tiering/

In some recent discussions, I sensed there is some confusion around solid state device (SSD) storage used as a storage tier vs. a cache. While there are some similarities, and both are intended to achieve the same end result, i.e. acceleration of data accesses from slower storage, there are some definite differences which I thought I’d try to clarify here. This is my working viewpoint, so please do post comments if you feel differently.

Firstly, SSD caching is temporary storage of data in an SSD cache, whereas true data tiering is a semi-permanent movement of data to or from an SSD storage tier. Both are based on algorithms or policies that ultimately result in data being copied to, or removed from, SSDs. To clarify further: if you were to unplug or remove your SSDs, in the caching case the user data is still stored in the primary storage behind the SSD cache and is still served from the original source (albeit more slowly), whereas in a data tiering environment the user data (and capacity) is no longer available if the SSD tier is removed, because the data was physically moved to the SSDs and most likely removed from the original storage tier.

Another subtle difference between caching and tiering is whether the SSD capacity is visible or not. In cached mode, the SSD capacity is totally invisible, i.e. the end application simply sees the data accessed much faster if it has been previously accessed and is still in the cache store (a cache hit). So if a 100GB SSD cache exists in a system with, say, 4TB of hard disk drive (HDD) storage, the total capacity is still only 4TB, i.e. that of the hard disk array, with 100% of the data always existing on the 4TB and only copies of the data in the SSD cache, based on the caching algorithm used. In a true data tiering setup using SSDs, the total storage is 4.1TB, and though this may be presented to a host computer as one large virtual storage device, part of the data exists on the SSD and the remainder on the hard disk storage. Typically, such a small amount of SSD would not be implemented as a dedicated tier, but you get the idea if, say, 1TB of SSD storage were used in a storage area network of 400TB of hard-drive-based storage, creating 401TB of usable capacity.
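The capacity difference is easy to state as arithmetic. A trivial sketch of the two cases above, using the illustrative sizes from the text:

    # Usable capacity seen by the host: SSD as cache vs. SSD as tier.
    hdd_tb = 4.0    # hard disk array behind the SSD
    ssd_tb = 0.1    # 100GB of SSD

    cache_usable = hdd_tb            # cache holds copies only, so its space is invisible
    tier_usable = hdd_tb + ssd_tb    # tier holds moved data, so its space joins the pool

    print(f"SSD as cache: {cache_usable:.1f} TB usable")   # 4.0 TB
    print(f"SSD as tier:  {tier_usable:.1f} TB usable")    # 4.1 TB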

So how does data make it into a cache versus a tier? Cache and block level automated data tiering controllers monitor and operate on statistics gathered from the stream of storage commands and in particular the addresses of the storage blocks being accessed.

SSD Caching Simplified

Caching models typically employ a lookup table method based on the block level address (or range of blocks) to establish if the data the host is requesting has been accessed before and potentially exists in the SSD cache. Data is typically moved more quickly into an SSD cache versus say tiering where more analysis of the longer term trend is typically employed which can span hours if not days in some cases. Unlike DRAM based caches however where it is possible to cache all reads, a little more care and time is taken with SSDs to ensure that excessive writing to the cache is avoided given the finite number of writes an SSD can tolerate. Most engines use some form of “hot-spot” detection algorithm to identify frequently accessed regions of storage and move data into the cache area once it has been established there is a definite frequent access trend.

Traditional caching involves one of several classic caching algorithms which result in either read-only or read-and-write caching. Cache algorithms and approaches vary by vendor and dictate how a read from the HDD storage results in a copy of the original data entering the cache table and how long it “lives” in the cache itself. Subsequent reads of that same data, whose original location was on the hard drive, can now be served from the SSD cache instead of the slower HDD, i.e. a cache hit (determined using an address lookup in the cache tables). If this is the first time data is being accessed from a specific location on the hard drive(s), then the data must first be read from the slower drives and a copy made in the SSD cache if the hot-spot checking algorithm deems it necessary (triggered by the cache miss).

Caching algorithms often try to use more sophisticated models to pre-fetch data based on a trend and store it in the cache if there is a high probability it may be accessed soon, e.g. in sequential video streaming or VMware virtual machine migrations, where it is beneficial to read data from the next sequential addresses and pull it into the cache at the same time as the initial access. After some period of time, or when new data needs to displace older or stale data in the cache, a cache flush cleans out the old data. This may also be triggered by the hot-spot detection logic determining that the data is now “cold”.

The measure of a good cache is how many hits it gets versus misses. If data is very random and scattered over the entire addressable range of storage with infrequent accesses back to the same locations, then the effectiveness is significantly lower and sometimes detrimental to overall performance as there is an overhead in attempting to locate data in the cache on every data access.
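To make those mechanics concrete, here is a toy Python sketch of a read cache with hot-spot admission, LRU eviction and hit/miss counting. It is not any vendor’s algorithm, just an illustration of the lookup-table, admission-threshold and hit-rate ideas described above:

    from collections import OrderedDict, defaultdict

    class ToyReadCache:
        """Illustrative SSD-style read cache: a block is only admitted after it
        has been read `admit_after` times (hot-spot detection, to limit SSD
        writes); the least recently used copies are evicted once `capacity`
        is exceeded."""

        def __init__(self, capacity, admit_after=3):
            self.capacity = capacity
            self.admit_after = admit_after
            self.cache = OrderedDict()            # block address -> cached copy of the data
            self.read_counts = defaultdict(int)   # access statistics per block address
            self.hits = self.misses = 0

        def read(self, block, read_from_hdd):
            self.read_counts[block] += 1
            if block in self.cache:               # cache hit: serve the SSD copy
                self.cache.move_to_end(block)
                self.hits += 1
                return self.cache[block]
            self.misses += 1
            data = read_from_hdd(block)           # cache miss: go to the slower HDD
            if self.read_counts[block] >= self.admit_after:
                self.cache[block] = data          # block is "hot" enough to copy in
                if len(self.cache) > self.capacity:
                    self.cache.popitem(last=False)   # evict the least recently used copy
            return data

        def hit_rate(self):
            total = self.hits + self.misses
            return self.hits / total if total else 0.0

Note that throwing the cache object away loses nothing but the copies and the statistics; every block is still readable from the HDD, which is the defining property of caching discussed earlier.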

SSD Auto Tiering Basics

An automated data tiering controller treats the SSD and HDDs as two separate physical islands of storage, even if presented to the host application (and hence the user) as one large contiguous storage pool (a virtual disk). A statistics gathering or scanning engine collects data over time and looks for data access patterns and trends that match a pre-defined set of policies or conditions. These engines use a mix of algorithms and rules that indicate how and when a particular block (or group of blocks) of storage is to be migrated or moved.

The simplest “caching like” approach used by a data tiering controller is based on frequency of access. For example, it may monitor data blocks being accessed from the hard drives, and if a block passes a pre-defined number of accesses “N” within a period of time “T”, a rule may be employed that says: when N > 1000 AND T > 60 minutes, move the data up to the next logical tier. So if data is being accessed a lot from the hard drive and there are only two tiers defined, SSD being the faster of the two, the data will be copied to the SSD tier (i.e. promoted), and the virtual address map that converts real-time host addresses to physical locations is updated to point to the new location in SSD storage. All of this happens behind a virtual interface to the host itself, which has no idea the storage just moved to a new physical location. Depending on the tiering algorithm and vendor, the data may be discarded on the old tier to free up capacity. The converse is also true: if data is infrequently accessed and lives on the SSD tier, it may be demoted to the HDD tier based on similar algorithms.
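A minimal sketch of that kind of promotion/demotion policy (the N > 1000 accesses within a 60-minute window rule above); the data mover and the virtual map here are simple placeholders for whatever a real tiering controller does:

    import time
    from collections import defaultdict

    PROMOTE_ACCESSES = 1000      # N: accesses within the window needed to promote
    WINDOW_SECONDS = 60 * 60     # T: 60-minute observation window
    DEMOTE_ACCESSES = 10         # illustrative "cold" threshold for demotion

    access_log = defaultdict(list)        # block -> timestamps of recent accesses
    tier = defaultdict(lambda: "hdd")     # virtual map: which tier currently holds each block

    def move_block(block, src, dst):
        """Placeholder for the controller's data mover and address-map update."""
        print(f"moving block {block}: {src} -> {dst}")

    def record_access(block, now=None):
        now = time.time() if now is None else now
        recent = [t for t in access_log[block] if now - t <= WINDOW_SECONDS] + [now]
        access_log[block] = recent
        if tier[block] == "hdd" and len(recent) > PROMOTE_ACCESSES:
            move_block(block, "hdd", "ssd")   # promote: the data is moved, not copied
            tier[block] = "ssd"

    def sweep_for_demotion(now=None):
        """Periodically demote blocks on the SSD tier that have gone cold."""
        now = time.time() if now is None else now
        for block, stamps in access_log.items():
            recent = [t for t in stamps if now - t <= WINDOW_SECONDS]
            if tier[block] == "ssd" and len(recent) < DEMOTE_ACCESSES:
                move_block(block, "ssd", "hdd")
                tier[block] = "hdd"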

More sophisticated tiering models exist of course, some that work at file layers and look at the specific data or file metadata to make more intelligent decisions about what to do with data.

Where is SSD Caching or Tiering Applied?

Typically, SSD caching is implemented as a single SATA or PCIe flash storage device along with an operating system driver layer in a direct attached storage (DAS) environment to speed up Windows or other operating system accesses. In much larger data center storage area networks (SAN) and cloud server-storage environments, there is an increasing number of dedicated rackmount SSD storage units that can act as a transparent cache at LUN level, where the caching is all done in the storage area network layer, again invisible to the host computer. The benefit of cache-based systems is that they can be added transparently and often non-disruptively (other than the initial install). Unlike with tiering, there is no need to set up dedicated pools or tiers of storage, i.e. they can be overlaid on top of an existing storage setup.

Tiering is more often found in larger storage area network based environments, with several disk array and storage appliance vendors offering the capability to tier between different disk arrays based on their media type or configuration. Larger tiered systems often also use other backup storage media such as tape or virtual tape systems. Automated tiering can substantially reduce the management overhead associated with backup and archival of large amounts of data by fully automating the movement process, or help meet the data accessibility requirements of government regulations. In many cases, it is possible to tier data transparently between different media types within the same physical disk array, e.g. a few SSD drives in RAID 1 or 10, 4-6 SAS drives in a RAID 10 and 6-12 SATA drives in a RAID set, i.e. three distinct tiers of storage. Distributed or virtualized storage environments also offer either manual or automated tiering mechanisms that work within their proprietary environments. At the other end of the spectrum, file volume manager and storage virtualization solutions running on the host or in a dedicated appliance can allow IT managers to organize existing disk array devices of different types and vendors and sort them into tiers. This is typically a process that requires a reasonable amount of planning and often disruption, but it can yield tremendous benefits once deployed.

SSD Tiering versus Caching: Part 2

 

BY TEKINERD, ON AUGUST 14TH, 2011

http://tekinerd.com/2011/08/ssd-tiering-versus-caching-part-2/

A while back I wrote about some of the differences between caching and tiering when using solid state disk (SSD) drives in a PC or server.

Having just returned from the 2011 Flash Memory Summit in Santa Clara, I feel compelled to add some additional color around the topic, given the level of confusion clearly evident at the show. Also, I’d like to blatantly plug an upcoming evolution in tiering called MicroTiering from our own company, Enmotus, which emerged from stealth at the show.

The simplest high level clarification that emerged from the show, I’m glad to say, matched what we described in our earlier blog (SSD Caching versus Tiering): caching makes a copy of frequently accessed data from a hard drive and places it in the SSD for future reads, whereas tiering moves the data permanently to the SSD so it is no longer stored on the hard drive. Caching speeds up reads only at this point, with a modified caching algorithm to account for SSD behavior versus RAM-based schemes, whereas tiering simply maps the host reads and writes to the appropriate storage tier with no additional processing overhead. So with tiering you get the write advantage and, of lesser benefit, the incremental capacity of the SSD, which becomes available to the host as usable storage (minus some minor overhead to keep track of the mapping tables).

Why the confusion? One RAID vendor in particular, along with several caching companies, is calling its direct attached storage (DAS) caching solution “tiering”, even though it is only caching the data to speed up reads and the data isn’t moved. Sure, write-based caching is coming, but it is still fundamentally a copy of the data that is on the hard drive, not a move, and SSD caching algorithms apply.

Where Caching is Deployed

SSD caching has a strong and viable place in the world of storage and computing at many levels, so it’s not a case of tiering versus caching, but more a question of when to use either or both. Also, caching is relatively inexpensive and will most likely end up bundled for free in PC desktop applications with the SSD you are purchasing for Windows applications, simply because this is how all caching ends up, i.e. “free” with some piece of hardware, an SSD in this case. A case in point is Intel’s Matrix RAID, which has now been enhanced with its own caching scheme called Smart Response Technology (SRT), currently available for Z68 flavor motherboards and systems.

In the broader sense, we are now seeing SSD caching deployed in a number of environments:

  • Desktops (eventually notebooks with both SSD and hard drives) bundled with SSDs or as standalone software e.g. Intel SRT and Nvelo (typically Windows only)
  • Server host software based caching e.g. FusionIO, IOturbine, Velobit (Windows and VMware)
  • Hardware PCIe adapter based server RAID SSD caching e.g. LSI’s CacheCade (most operating systems)
  • SAN based SSD caching software, appliances or modules within disk arrays e.g. Oracle’s ZFS caching schemes (disk arrays) or specialist appliances that transparently cache data into SSDs in the SAN network.

Where Data Tiering is Deployed

Tiering is still fundamentally a shared SAN-based storage technology used with large data sets. In its current form, it’s really an automated way to move data between slow, inexpensive bulk storage (e.g. SATA drives, possibly even tape drives) and fast, expensive storage based on its frequency of access or “demand”. Why? So data managers can keep expensive storage costs to a minimum by taking advantage of the fact that typically less than 20% of data is being accessed over any specific period of time. YouTube is a perfect example. You don’t want to keep a newly uploaded video stored on a large SSD disk array just in case it becomes highly popular versus the other numerous uploads. Tiering automatically identifies that the file (or more correctly, a file’s associated low-level storage ‘blocks’) is starting to increase in popularity, and moves it up to the fast storage for you automatically. Once on the higher-performance storage, it can handle a significantly higher level of hits without causing excessive end-user delays and the infamous video box ‘spinning wheel’. Once the popularity dies down, the data is moved back, making way for other content that may be on the popularity rise.

Tiering Operates Like A Human Brain

The thing I like about tiering is that it’s more like how we think as humans, i.e. pattern recognition over a large data set, with an almost automated and instant response to a trend, rather than looking at independent and much smaller slices of data as with caching. A tiering algorithm observes data access patterns on the fly, determines how often and, more importantly, what type of access is going on, and adapts accordingly. For example, it can determine if an access pattern is random or sequential and allocate data to the right type of storage media based on its characteristics. A great “big iron” example is EMC’s FAST, or the now defunct Atrato.

Tiering can also scale better to multiple levels of storage types. Whereas caching is limited to either RAM, single SSDs or tied to a RAID adapter, tiering can operate on multiple tiers of storage from a much broader set, up to and including cloud storage (i.e. a very slow tier), for example.

MicroTiering

At the show, I introduced the term MicroTiering, one of the solutions our company Enmotus will be providing in the near future. MicroTiering is essentially a direct attached storage version of its SAN cousin, applied to the much smaller subset of storage that is inside the server itself. It’s essentially a hardware-accelerated approach to tiering at the DAS level that doesn’t tax the host CPU and facilitates a much broader set of operating system and hypervisor support versus the narrow host SSD caching-only offerings we see today that are confined to just a few environments.

Tiering and Caching Together

The two technologies are not mutually exclusive. In fact, it is more than likely that tiering and caching involving SSDs will be deployed together, as they both provide different benefits. For example, caching tends to favor the less expensive MLC SSDs, as the data is only copied and the cache handles the highly read-oriented, transient or non-critical data, so loss of the SSD cache itself is non-critical. It’s also the easiest way to add a very fast, direct attached SSD cache to your server, provided your operating system or VM environment can handle it.

On the other hand, as tiering relocates the data to the SSD, SLC is preferable for its higher performance on reads and writes, higher resilience and better data retention characteristics. In the case of DAS-based tiering solutions like MicroTiering, it is expected that tiering may also be better suited to virtual machine environments and databases due to its inherent and simpler write advantage, low to zero host software layers and VMware’s tendency to shift the read-write balance more toward 50/50.

What’s for sure is that there is plenty of innovation and excitement still going on in this space, with lots more to come.

Posted in Mix & Match | Leave a Comment »

PCIe Flash versus SATA or SAS Based SSD

Posted by Brajesh Panda on July 10, 2014

BY TEKINERD, ON SEPTEMBER 2ND, 2010

http://tekinerd.com/2010/09/pcie-flash-versus-sata-or-sas-based-ssd/

The impressive results being presented by the new PCIe based server or workstation add-in card flash memory products hitting the market from the likes of FusionIO and others are certainly pushing up the performance envelope of many applications, especially in transactional database applications where the number of user requests is directly proportional to the storage IOPS or data throughput capabilities.

In just about all cases, general purpose off-the-shelf PCIe SSD devices present themselves as a regular storage device to the server, e.g. in Windows they appear as a SCSI-like device that can be configured in the disk manager as a regular disk volume (e.g. E: or F:). The biggest advantage PCIe SSDs have over standalone SATA or SAS SSD drives is that they can handle greater data traffic throughput and I/Os, as they use the much faster PCIe bus to connect directly to multiple channels of flash memory, often using a built-in RAID capability to stripe data across multiple channels of flash mounted directly on board the add-in card.
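As a rough illustration of that striping idea (the channel count and stripe size below are made-up numbers, not taken from any particular card), consecutive logical block ranges land on different flash channels so they can be read or written in parallel:

    CHANNELS = 8         # number of on-board flash channels (illustrative)
    STRIPE_BLOCKS = 16   # logical blocks written to one channel before moving on

    def place_block(lba):
        """Map a logical block address to (channel, offset), RAID-0 style."""
        stripe = lba // STRIPE_BLOCKS
        channel = stripe % CHANNELS
        offset = (stripe // CHANNELS) * STRIPE_BLOCKS + lba % STRIPE_BLOCKS
        return channel, offset

    # neighbouring stripes hit different channels, so a large transfer is spread
    # over several channels at once instead of queueing behind a single device
    for lba in (0, 16, 32, 48, 128):
        print(lba, place_block(lba))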

To help clear up confusion for some of the readers, the primary differences between PCIe Flash memory and conventional SSDs can be summarized as follows:


Where PCIe Flash Works Well

The current generation of PCIe flash SSDs is best suited to applications that require the absolute highest performance with less of an emphasis on long-term serviceability, as you have to take the computer offline to replace defective or worn-out SSDs. They also tend to work best when the total storage requirements for the application can live on the flash drive. Today’s capacities of up to 320GB (SLC) or 640GB (MLC) are more than ample for many database applications, so placing the entire SQL database on the drive is not uncommon. Host software RAID 1 is typically used to make the setup more robust, but this gets expensive as high-capacity PCIe SSD cards run well north of $10,000 retail, the high price typically a result of the extensive reliability and redundancy capability of the card’s on-board flash controller. As the number of PCIe flash adapter offerings grows and the market segments into the more traditional low/mid/high product categories and features, expect the average price of these types of products to come down relatively fast.

Where SSDs Work Well

SATA or SAS based SSDs, by design, work pretty much anywhere a conventional hard drive does. For that reason we see laptops, desktops, servers and external disk arrays adopting them relatively quickly. Depending on the PCIe flash being compared to, it can take anywhere from 5-8 SSDs behind a hardware RAID adapter to match the performance of a PCIe version, which tends to push the overall price higher when using the more expensive SLC-based SSDs. So SATA or SAS SSDs tend to be best suited to applications that can use them as a form of cache in combination with a traditional SATA or SAS disk array setup. For instance, it is possible to achieve similar performance and significantly lower system and running costs using 1-4 enterprise-class SSDs and SATA drives in a SAN disk array versus a Fibre Channel or SAS 15K SAN disk array setup. Most disk array vendors are now offering SSD versions of their Fibre Channel, iSCSI or SAS based RAID offerings.

Enterprise Flash Memory Industry Direction

At the Flash Summit we learned that between SSDs and DRAM a new class of storage will appear for computing, referred to as SCM, or storage class memory. Defined as something broader than just ultra-fast flash-based storage, it does require that the storage be persistent and appear to the host more like conventional DRAM, i.e. linear memory, rather than a storage I/O controller with mass storage and a SCSI host driver. SCM is expected to enter mainstream servers by 2013.

Posted in Mix & Match | Leave a Comment »

 