David Shifflet's Snippets

Mindset + Skillset + Toolkit = Success





C# XmlSerializer and Versioning

The C# XmlSerializer lets you easily save an instance of an object as XML, usually to a file. To use it, you do something like:

public class SampleFile
{
	public string Name { get; set; }
	public List<string> Warnings { get; set; }
}

...

	// Write it to a file		
	var serializer = new XmlSerializer(typeof(SampleFile));
	using (var sw = new StreamWriter(TestFiles[0].Create()))
	{		
		var file = new SampleFile()
		{
			Name = "Dave",
			Warnings = new List<string>() { "A", "B", "C" }
		};
		serializer.Serialize(sw, file);
	}
	
...

	// Read from a file
	var serializer = new XmlSerializer(typeof(SampleFile));
	using (var sr = new StreamReader(TestFiles[0].OpenRead()))
	{
		var file = (SampleFile) serializer.Deserialize(sr);
	}			
And the serialized XML will look like:
<?xml version="1.0" encoding="utf-8"?>
<SampleFile xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Name>Dave</Name>
  <Warnings>
    <string>A</string>
    <string>B</string>
    <string>C</string>
  </Warnings>
</SampleFile>
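
If you want to try the round trip without touching the file system, here is a minimal sketch that serializes to a string instead of a file. RoundTrip, ToXml and FromXml are names made up for this illustration, and the XML declaration a StringWriter produces may differ from the file-based output above.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class SampleFile
{
    public string Name { get; set; }
    public List<string> Warnings { get; set; }
}

// Hypothetical helper, not part of the original sample code.
public static class RoundTrip
{
    // Serialize a SampleFile to an XML string.
    public static string ToXml(SampleFile file)
    {
        var serializer = new XmlSerializer(typeof(SampleFile));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, file);
            return writer.ToString();
        }
    }

    // Deserialize a SampleFile from an XML string.
    public static SampleFile FromXml(string xml)
    {
        var serializer = new XmlSerializer(typeof(SampleFile));
        using (var reader = new StringReader(xml))
        {
            return (SampleFile)serializer.Deserialize(reader);
        }
    }
}
```

Calling RoundTrip.FromXml(RoundTrip.ToXml(file)) should give back an object with the same Name and the same three warnings.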

The Problem

The XmlSerializer works fine when you add or remove properties, but changing the type of an existing property over time breaks older files.

For example, let's say that after a year someone decides to change SampleFile.Warnings to store something other than a collection of strings. Something like:

public class SampleFileWarning
{
	public string Priority { get; set; }
	public string Name { get; set; }
}

public class SampleFile
{
	public string Name { get; set; }
	public List<SampleFileWarning> Warnings { get; set; }
}	
And the serialized XML will look like:
<?xml version="1.0" encoding="utf-8"?>
<SampleFile xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Name>Dave</Name>
  <Warnings>
    <SampleFileWarning>
      <Priority>Default</Priority>
      <Name>A</Name>
    </SampleFileWarning>
    <SampleFileWarning>
      <Priority>Default</Priority>
      <Name>B</Name>
    </SampleFileWarning>
    <SampleFileWarning>
      <Priority>Default</Priority>
      <Name>C</Name>
    </SampleFileWarning>
  </Warnings>
</SampleFile>
While we can deserialize new files serialized after this change, we can no longer correctly deserialize the old files, where SampleFile.Warnings is a List<string>. This is a problem.

A first attempt at a solution is to create different classes representing the different versions, something like:

public class SampleFileV1
{
	public string Name { get; set; }
	public List<string> Warnings { get; set; }
}

public class SampleFileV2
{
	public string Name { get; set; }
	public List<SampleFileWarning> Warnings { get; set; }
}

public class SampleFileWarning
{
	public string Priority { get; set; }
	public string Name { get; set; }
}
This works and the data will get serialized, but there is one major problem.

The developers need to know which version of the data they are using, and every time it changes they need to change the names from SampleFileV1 to SampleFileV2 to SampleFileV3, etc. Why can't we use SampleFile anymore?

The Solution

We are going to need our own serializer. We aren't going to write a whole new XmlSerializer; we are just going to wrap the existing one. This is going to let us:

  • Reduce the number of change points in case we need to modify the serialization in the future.
  • Let developers use SampleFile instead of the specific versions.
  • Catch mismatches between the class and a changed version at compile time, because code is written against SampleFile rather than a specific version.
  • Deserialize the older versions to the current version of SampleFile.
So let's get started with our models:
// THE TYPE FOR DEVELOPERS
public class SampleFile : SampleFileV2
{
	// LEAVE EMPTY
}

[XmlInclude(typeof(SampleFile))]
public class SampleFileV2
{
	public string Name { get; set; }
	public List<SampleFileWarning> Warnings { get; set; }
}

public class SampleFileV1
{
	public string Name { get; set; }
	public List<string> Warnings { get; set; }

	// The translation (Casting)
	public static implicit operator SampleFile(SampleFileV1 file)
	{
		var result = new SampleFile
		{
			Name = file.Name,
			Warnings = new List<SampleFileWarning>()
		};

		foreach (var warning in file.Warnings)
		{
			result.Warnings.Add(new SampleFileWarning()
			{
				Name = warning,
				Priority = "Default"
			});
		}
		return result;
	}
}

public class SampleFileWarning
{
	public string Priority { get; set; }
	public string Name { get; set; }
}
Things to notice:
  • SampleFile is basically empty, but it inherits from the current version. This lets a SampleFile be used wherever the current version is expected.
  • The current version, SampleFileV2, has the attribute [XmlInclude(typeof(SampleFile))]. This lets the XmlSerializer know about the SampleFile type. The current version should always have this attribute.
  • The older version, SampleFileV1, has an implicit conversion operator that turns it into a SampleFile, which is the current version. When a version is deprecated, be sure to write a conversion that translates it to the current version.
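
To see the conversion on its own, here is a small sketch with a condensed copy of the models. ConversionDemo and its two methods are hypothetical helpers added for illustration. The first relies on the compiler to apply the implicit operator. The second finds the same operator by reflection under its compiled name, op_Implicit, which is handy when the version type is only known at run time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SampleFileWarning
{
    public string Priority { get; set; }
    public string Name { get; set; }
}

public class SampleFileV2
{
    public string Name { get; set; }
    public List<SampleFileWarning> Warnings { get; set; }
}

public class SampleFile : SampleFileV2 { }

public class SampleFileV1
{
    public string Name { get; set; }
    public List<string> Warnings { get; set; }

    // Condensed version of the translation shown above.
    public static implicit operator SampleFile(SampleFileV1 file)
    {
        return new SampleFile
        {
            Name = file.Name,
            Warnings = (file.Warnings ?? new List<string>())
                .Select(w => new SampleFileWarning { Name = w, Priority = "Default" })
                .ToList()
        };
    }
}

// Hypothetical helpers, not part of the original sample code.
public static class ConversionDemo
{
    // The compiler inserts the call to the implicit operator here.
    public static SampleFile Upgrade(SampleFileV1 old)
    {
        SampleFile current = old;
        return current;
    }

    // The same conversion, located by reflection.
    public static SampleFile UpgradeByReflection(object old)
    {
        var op = old.GetType().GetMethod("op_Implicit");
        // The operator is static, so the target instance is null.
        return (SampleFile)op.Invoke(null, new[] { old });
    }
}
```

Note that the reflection lookup by name only works while a type declares a single implicit operator. With more than one, GetMethod would need the parameter types as well.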

Now that we have our models let's look at the serializer.

using System;
using System.IO;
using System.Reflection;
using System.Xml;
using System.Xml.Serialization;

public class SampleFileSerializer
{
	public SampleFile Deserialize(StreamReader reader)
	{            
		var type = GetXmlType(reader);
		if (type == null)
		{
			throw new InvalidOperationException("Unable to determine the version from the XML");
		}

		var serializer = new XmlSerializer(type);
		var obj = serializer.Deserialize(reader);

		// Older versions define an implicit conversion to SampleFile
		var caster = type.GetMethod("op_Implicit");
		if (caster != null)
		{                
			// op_Implicit is static, so there is no target instance
			return (SampleFile) caster.Invoke(null, new [] { obj });
		}
		return (SampleFile) obj;
	}

	public void Serialize(StreamWriter sw, SampleFile file)
	{
		var type = GetCurrentSpecificVersion();

		if (type == null)
		{
			throw new InvalidOperationException("Unable to find current specific version");
		}

		new XmlSerializer(type).Serialize(sw, file);
	}

	private Type GetCurrentSpecificVersion()
	{
		foreach (var type in Assembly.GetExecutingAssembly().GetTypes())
		{
			// The current version is the type that SampleFile derives from
			if (type.IsAssignableFrom(typeof(SampleFile)) && typeof(SampleFile) != type)
			{
				return type;
			}
		}
		return null;
	}

	private Type GetXmlType(StreamReader streamReader)
	{
		try
		{
			using (var reader = XmlReader.Create(streamReader))
			{
				while (reader.Read())
				{
					if (reader.IsStartElement())
					{
						// Assumes the models live in a namespace named after the assembly
						var assembly = Assembly.GetExecutingAssembly();
						var lookFor = string.Format("{0}.{1}", assembly.GetName().Name, reader.Name);
						return assembly.GetType(lookFor);
					}
				}
			}
			return null;
		}
		finally
		{
			// Rewind so the XmlSerializer can read the stream from the start
			streamReader.BaseStream.Position = 0;
			streamReader.DiscardBufferedData();
		}
	}
}
Let's look at each method:
  • Deserialize(...) - Looks inside the XML file and determines the type from the root element, for example "<SampleFileV1>". It then deserializes to that version, converts the result to the current version if needed, and returns the object.
  • Serialize(...) - Gets the type that represents the current version and serializes the object as that type. In the example code a SampleFile will be serialized as SampleFileV2.
  • GetCurrentSpecificVersion(...) - Gets the current specific version. Basically, it looks at SampleFile and detects what it inherits from. In the above code that is SampleFileV2.
  • GetXmlType(...) - Looks inside the XML file for the first start element. When the XmlSerializer serializes an object, the root element is named after the type by default.
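
The root element lookup that GetXmlType(...) performs can be tried in isolation. The sketch below works on a string rather than a stream so there is nothing to rewind. RootElementPeek and GetRootName are hypothetical names, and it uses XmlReader.MoveToContent() instead of the Read() loop from the serializer above.

```csharp
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original sample code.
public static class RootElementPeek
{
    // Returns the local name of the root element, or null if the
    // first content node is not an element.
    public static string GetRootName(string xml)
    {
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            // Skips the XML declaration, comments and whitespace.
            reader.MoveToContent();
            return reader.NodeType == XmlNodeType.Element ? reader.LocalName : null;
        }
    }
}
```

The returned name is what the serializer would then map to a type such as SampleFileV1 or SampleFileV2.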
And here is how to use the SampleFileSerializer:

class Program
{
	private static readonly FileInfo[] TestFiles = 
	{
		new FileInfo("v1.xml"),
		new FileInfo("v2.xml")
	};

	static void Main(string[] args)
	{
		CreateTestFiles();
		SerializeAndDeserialize();
	}

	static void CreateTestFiles()
	{
		using (var sw = new StreamWriter(TestFiles[0].Create()))
		{
			var serializer = new XmlSerializer(typeof(SampleFileV1));
			var file = new SampleFileV1()
			{
				Name = "Dave",
				Warnings = new List<string>() { "A", "B", "C" }
			};
			serializer.Serialize(sw, file);
		}
	}

	static void SerializeAndDeserialize()
	{
		//Confirm TestFiles[0] is SampleFileV1
		var testSerializer = new XmlSerializer(typeof(SampleFileV1));
		using (var sr = new StreamReader(TestFiles[0].OpenRead()))
		{                
			var testFile = (SampleFileV1) testSerializer.Deserialize(sr);
			if (testFile.GetType() != typeof(SampleFileV1))
			{
				throw new InvalidOperationException("testFile is not SampleFileV1");
			}
		}

		// Version 1 as Version 2
		var serializer = new SampleFileSerializer();
		SampleFile file;
		// Read a version 1 file
		using (var sr = new StreamReader(TestFiles[0].OpenRead()))
		{
			file = serializer.Deserialize(sr);
			CheckWarnings(file);
			ConfirmBaseType(file, typeof(SampleFileV2));
		}

		//Save it as version 2 file
		using (var sw = new StreamWriter(TestFiles[1].Create()))
		{
			serializer.Serialize(sw, file);
		}

		// Read a version 2 file
		using (var sr = new StreamReader(TestFiles[1].OpenRead()))
		{
			var versionFile = serializer.Deserialize(sr);                
			CheckWarnings(versionFile);
			ConfirmBaseType(versionFile, typeof(SampleFileV2));
		}
	}

	private static void ConfirmBaseType(SampleFile file, Type t)
	{
		if (file.GetType().BaseType != t)
		{
			throw new InvalidOperationException(string.Format("Expecting {0} got {1}",
				t, file.GetType().BaseType));                    
		}
	}

	private static void CheckWarnings(SampleFile file)
	{
		if (file.Warnings.Count != 3)
		{
			throw new InvalidOperationException("Warnings should be 3");
		}
	}
}
So let's break down the methods:
  • Main() - The entry point for the console application.
  • CreateTestFiles() - Creates a file we can use for the test as a SampleFileV1.
  • SerializeAndDeserialize(...) - Deserializes a SampleFileV1 and serializes it as a SampleFile. The type in the XML will be SampleFileV2, which is the current version.
  • ConfirmBaseType(...) - This confirms that the BaseType of the SampleFile is the type we are expecting.
  • CheckWarnings(...) - This confirms that the SampleFile has three warnings.
So basically we are serializing the first version, deserializing it via the SampleFileSerializer, serializing it via the SampleFileSerializer, and then deserializing it again via the SampleFileSerializer. Along the way we confirm we have the right types and check that the thing we changed, SampleFile.Warnings, still has the data.

I urge you to look at the sample code and step through it to see what is happening.

The other problem

If we have a bunch of version one files that look like:

<?xml version="1.0" encoding="utf-8"?>
<SampleFile xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Name>Dave</Name>
  <Warnings>
    <string>A</string>
    <string>B</string>
    <string>C</string>
  </Warnings>
</SampleFile>
We are going to need to change the root element to be version specific so the SampleFileSerializer can deal with these files. You would need to change them to:
<?xml version="1.0" encoding="utf-8"?>
<SampleFileV1 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Name>Dave</Name>
  <Warnings>
    <string>A</string>
    <string>B</string>
    <string>C</string>
  </Warnings>
</SampleFileV1>
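
If there are many such files, the rename could be scripted rather than done by hand. Here is a rough sketch using LINQ to XML. LegacyFileMigration and RenameRoot are hypothetical names, and the sketch works on strings to stay self-contained. For real files you would probably load and save with XDocument.Load and XDocument.Save instead.

```csharp
using System.Xml.Linq;

// Hypothetical helper, not part of the original sample code.
public static class LegacyFileMigration
{
    // Renames the root element from oldName to newName and leaves
    // the rest of the document, including namespace declarations, alone.
    public static string RenameRoot(string xml, string oldName, string newName)
    {
        var doc = XDocument.Parse(xml);
        if (doc.Root.Name.LocalName == oldName)
        {
            // Keep whatever namespace the root already has.
            doc.Root.Name = doc.Root.Name.Namespace + newName;
        }
        // ToString() omits the XML declaration; XDocument.Save writes it.
        return doc.ToString();
    }
}
```

Calling RenameRoot(xml, "SampleFile", "SampleFileV1") on a version one file produces a document the SampleFileSerializer can recognize. Files whose root is already version specific are returned unchanged apart from formatting.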

Summary

There are a lot of ways to tackle this problem. This is just one of them. If you are just adding or removing properties on a class, you may not need to do any of this.

The XmlSerializer is very flexible, but if you are changing the types of the properties you are going to need to do something. No matter which way you tackle this, I urge you to consider wrapping the XmlSerializer so you reduce change points in the future.

If you are using binary serialization, look at the functionality it provides. For example, there are callbacks that will be called before or after deserialization. The OptionalField attribute also has a VersionAdded property that helps with versioning.
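
As a rough illustration of those binary serialization hooks, the sketch below only shows how the attributes are declared. BinarySampleFile and SetDefaults are hypothetical names, and the block does not run a formatter, so treat it as a starting point rather than a tested versioning scheme.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical class, not part of the original sample code.
[Serializable]
public class BinarySampleFile
{
    public string Name;

    // Marks a field added in version 2, so streams written by
    // version 1 that lack it are not treated as an error.
    [OptionalField(VersionAdded = 2)]
    public string Priority;

    // Callback invoked by the formatter after deserialization.
    [OnDeserialized]
    private void SetDefaults(StreamingContext context)
    {
        if (Priority == null)
        {
            Priority = "Default";
        }
    }
}
```

The callback plays a similar role to the implicit conversion in the XML approach: it is the place where data from an older version gets its missing values filled in.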

If you know of a better approach to solving this issue please let me know via the comments below or by email.

Sample Code