Lokad / ILPack

Serialize .NET Core assemblies

Type dependency resolution needed

osman-turan opened this issue · comments

We try to sort types by their dependencies prior to actual serialization:

private void CreateTypes(IEnumerable<Type> types)
{
    // Sort types by base types.
    var sortedTypes = types.TopologicalSort(GetBaseTypes).ToList();
    // First, reserve metadata for all types
    ReserveTypes(sortedTypes);
    // ... (rest of the method omitted from this excerpt)
}

Dependency solver method:

/// <summary>
/// Gets all interfaces and base types of a given type including all of its parents.
/// Referenced types from external assemblies are excluded.
/// </summary>
/// <param name="type">Type to be examined.</param>
/// <returns>All interfaces and base types of given type and its parents recursively.</returns>
private IEnumerable<Type> GetBaseTypes(Type type)
{
    foreach (var inf in type.GetInterfaces())
    {
        if (_metadata.IsReferencedType(inf))
        {
            continue;
        }
        yield return inf;
        foreach (var innerInf in GetBaseTypes(inf))
        {
            yield return innerInf;
        }
    }
    var baseType = type.BaseType;
    if (baseType != null)
    {
        while (!_metadata.IsReferencedType(baseType))
        {
            yield return baseType;
            baseType = baseType.BaseType;
        }
    }
}

As you can see, this method is expected to mishandle constructed generic types (e.g. IPage in class jot_2 : ICompiledJot<IPage>, when jot_2 and all the types it depends on are in the same assembly) and throw a cyclic dependency exception. Since our current unit tests are more complex, I tried removing the sorting completely, on the assumption that the types of a given assembly are already sorted in its metadata, and it worked.

Question: Is my assumption correct? If so, I will remove type dependency resolution completely. If not, I'll improve it.

I think you're right.

Since everything in the assembly gets pre-reserved with AssemblyMetadata.ReserveXXX, the handles for everything are available even before the entities themselves are generated. The only catch with pre-reserving is that everything must then be generated in exactly the same order, but it does quite nicely solve dependency order problems.

@toptensoftware Even for metadata reservation, we need the types sorted by dependencies. The order of other metadata entities (e.g. fields, properties, constructors, methods) is irrelevant, and reservation solves cyclic dependencies for those entities (see: #22). This is why I implemented type sorting.

However, I observed that the types in an Assembly object are already sorted. Although I don't have a solid reason for it (such as a rule enforced by ECMA-335), it would make sense for them to always be sorted.

> Even for metadata reservation, we need types sorted by dependencies

OK. I'm sure you're right, but I'm curious: why?

@toptensoftware No, there is no guarantee that I'm right 😄 That is why I created this issue.

Let's consider the following type definitions:

class MyBaseClass
{
}

class MyDerivedClass: MyBaseClass
{
}

The correctly sorted types would be:

var sortedTypes = new Type[] {
  typeof(MyBaseClass),
  typeof(MyDerivedClass)
};

During type reservation, we need to process base types and interfaces (and generic type arguments as well). AssemblyGenerator.ReserveTypeDefinition emits the metadata handle for a type while reserving its members, so that the type can be used from type members later. The relevant part of the AssemblyGenerator.ReserveTypeDefinition method (notice baseTypeHandle):

var typeHandle = _metadata.Builder.AddTypeDefinition(
    type.Attributes,
    type.DeclaringType == null ? _metadata.GetOrAddString(ApplyNameChange(type.Namespace)) : default(StringHandle),
    _metadata.GetOrAddString(type.Name),
    baseTypeHandle,
    MetadataTokens.FieldDefinitionHandle(offset.FieldIndex + 1),
    MetadataTokens.MethodDefinitionHandle(offset.MethodIndex + 1));

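If sortedTypes is reversed, the method should fail, because no handle for MyBaseClass exists yet at the point where MyDerivedClass needs its baseTypeHandle.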

Ah. I thought the type handles were getting reserved up front. Is there a reason that can't be done?

I think it can be done, but I need to revisit the source to validate that. If my guess about the types already being sorted is correct (I couldn't find any error so far), we don't need to do so.

It looks like when the types are defined (AddTypeDefinition from your above link), we need to have type handles for:

  1. The base type and any implemented interfaces - handled by GetBaseTypes()
  2. The outer type if this is a nested type declaration. We're not checking this, but we could assume outer classes will always be visited first (probably a safe assumption, but also probably hard to prove, especially after a topological sort)
  3. Any types referenced by generic parameter constraints (eg: where T : MyOtherClass)
  4. Potential future dependencies

For 2 and 3, GetBaseTypes could be updated to also return those connections for the topological sort, but I think this has the potential to introduce circular dependencies.

Unless I'm missing something, reserving the type handles up front seems a simpler approach.
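To make the "reserve the type handles up front" idea concrete, here is a minimal two-pass sketch. It is not ILPack's code: HandleTable, ReserveAll, DefineAll and ReserveFirstDemo are hypothetical names, plain integers stand in for metadata handles, and a constructed generic base type is simply mapped to its definition (real metadata would go through a TypeSpec). The point is only that once every type has a row number, the order in which definitions are emitted no longer depends on inheritance.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a metadata builder; not ILPack's actual API.
public sealed class HandleTable
{
    private readonly Dictionary<Type, int> _handles = new Dictionary<Type, int>();

    // Pass 1: assign a row number to every type, in whatever order they arrive.
    public void ReserveAll(IReadOnlyList<Type> types)
    {
        for (var i = 0; i < types.Count; i++)
        {
            _handles[types[i]] = i + 1; // metadata row numbers are 1-based
        }
    }

    // Pass 2: emit (row, base row) pairs in the SAME order as pass 1.
    // A base row of 0 means "the base type is not defined in this assembly".
    public List<(int Row, int BaseRow)> DefineAll(IReadOnlyList<Type> types)
    {
        var rows = new List<(int Row, int BaseRow)>();
        foreach (var type in types)
        {
            var baseType = type.BaseType;
            // Simplification: map Base<Derived> to its definition Base<T>.
            if (baseType != null && baseType.IsConstructedGenericType)
            {
                baseType = baseType.GetGenericTypeDefinition();
            }
            var baseRow = baseType != null && _handles.TryGetValue(baseType, out var row) ? row : 0;
            rows.Add((_handles[type], baseRow));
        }
        return rows;
    }
}

public class MyBaseClass { }
public class MyDerivedClass : MyBaseClass { }
public class Base<T> { }
public class Derived : Base<Derived> { }

public static class ReserveFirstDemo
{
    public static List<(int Row, int BaseRow)> Run()
    {
        // Deliberately "wrong" order: derived types come before their bases.
        var types = new[] { typeof(MyDerivedClass), typeof(MyBaseClass), typeof(Derived), typeof(Base<>) };
        var table = new HandleTable();
        table.ReserveAll(types);
        return table.DefineAll(types);
    }
}
```

Calling ReserveFirstDemo.Run() resolves the base row of MyDerivedClass and of Derived even though both are listed before their base types, and no sorting step is involved.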

Hi @osman-turan,

Here's a valid C# example that breaks the current approach.

class Base<T>
{
}

class Derived : Base<Derived>
{
}

This gives a "Cyclic connections are not allowed" error in TopologicalSort().
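One way to see where a cycle can come from is to inspect what reflection reports for this pair of classes. The snippet below is only an illustration (GenericBaseDemo and Inspect are made-up names, and it does not reproduce ILPack's sorter): the base type of Derived is the constructed type Base<Derived>, which is distinct from the definition Base<T> and carries Derived itself as a type argument. Any dependency walk that follows generic arguments therefore arrives back at Derived.

```csharp
using System;

public class Base<T> { }
public class Derived : Base<Derived> { }

public static class GenericBaseDemo
{
    // Reports what reflection says about the base type of Derived.
    public static (bool IsDefinition, bool IsConstructed, bool SameDefinition, bool ArgumentIsDerived) Inspect()
    {
        var baseType = typeof(Derived).BaseType; // Base<Derived>, a constructed generic type

        return (
            // The constructed base type is not the type definition Base<T> itself...
            IsDefinition: baseType == typeof(Base<>),
            IsConstructed: baseType.IsConstructedGenericType,
            // ...but it refers to that definition, and to Derived as its type argument.
            SameDefinition: baseType.GetGenericTypeDefinition() == typeof(Base<>),
            ArgumentIsDerived: baseType.GetGenericArguments()[0] == typeof(Derived));
    }
}
```

Inspect() returns (false, true, true, true): the base type is a constructed generic that is not the definition, yet it points to the definition and names Derived as its argument.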

@osman-turan I am inclined to agree with @toptensoftware. We probably should not rely on any topological sort, because cyclic dependencies are possible. I just pushed two commented-out tests that illustrate this need. The get-handles-first approach looks correct to me.

PR #105 should resolve this issue.

Unfortunately, #105 is not sufficient yet. We should probably first register not only all the type handles but all the member handles as well, and only then proceed. Right now, the test MyCyclicTypes.cs still fails because the derived class gets processed before the base class, so no handle is found for the constructor of the base class.

I'm pretty sure the cyclic dependency problem is solved and we are reserving all members (including constructors) before creating the types. The problem here is in handling closed generics in the GetConstructorHandle function.

It's failing to find constructor handle for:

SandboxSubject.MyBase`1[SandboxSubject.Derived] Void .ctor()

Which should get mapped (via a TypeSpec) to:

SandboxSubject.MyBase`1[T] Void .ctor()

I.e., this is part of a bigger problem with handling generics and is not related to circular dependencies. I think there are many places where this kind of issue isn't being handled correctly.
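For the lookup side of that mapping, a rough sketch using plain reflection might look like the following. ConstructorMapping and ToDefinition are hypothetical helpers, not part of ILPack, and the sketch says nothing about how the TypeSpec or member reference would then be emitted. It relies on the constructor of a constructed type and the constructor of its generic type definition sharing the same MetadataToken, since both come from the same metadata row.

```csharp
using System;
using System.Linq;
using System.Reflection;

public class MyBase<T> { public MyBase() { } }
public class Derived : MyBase<Derived> { }

public static class ConstructorMapping
{
    private const BindingFlags AllInstance =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    // Returns the constructor declared on the generic type definition that
    // corresponds to a constructor of a constructed (closed) generic type.
    public static ConstructorInfo ToDefinition(ConstructorInfo ctor)
    {
        var declaringType = ctor.DeclaringType;
        if (declaringType == null || !declaringType.IsConstructedGenericType)
        {
            return ctor; // not declared on a constructed generic type: nothing to map
        }

        var definition = declaringType.GetGenericTypeDefinition();
        // Both constructors originate from the same metadata row, so they share a token.
        return definition.GetConstructors(AllInstance)
            .Single(c => c.MetadataToken == ctor.MetadataToken);
    }
}
```

With this helper, the parameterless constructor of MyBase<Derived> maps to a constructor whose DeclaringType is MyBase<>, which is the entry a handle table keyed by definitions would contain.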

@vermorel I can confirm that the initial problem is completely solved, and that the last issue pointed out by @toptensoftware is a duplicate of #127, which is defined more precisely. I suggest closing this one for now.