tuist / XcodeProj

📝 Read, update and write your Xcode projects

Home Page: https://xcodeproj.tuist.io

PBXFileElement's (and PBXObject's?) implementation of Hashable is incorrect. Equal objects can have unequal hashes.

deatondg opened this issue · comments

I chased an issue in one of my projects down to strange behavior of PBXFileElement being used in a Dictionary or Set. I am able to get strange results and also a crash somewhat reproducibly. I have added a test case for this in my fork which I'll also put here:

    func test_canBeUsedInASet() throws {
        // Run this a bunch of times because, I assume, the hasher seed changes each time
        for _ in 0...300 {
            let xcodeProj = try XcodeProj(path: fixturesPath() + "iOS/Project.xcodeproj")
            let pbxproj = xcodeProj.pbxproj
            
            let filesArray: [PBXFileElement] = pbxproj.fileReferences + pbxproj.groups + pbxproj.variantGroups
            
            var filesSet: Set<PBXFileElement> = []
            filesSet.formUnion(filesArray)
            
            for file in filesArray {
                if !filesSet.contains(file) {
                    XCTFail("""
                    PBXFileElement's implementation of Hashable is broken somehow.
                    """)
                }
            }
            for file in filesSet {
                if !filesArray.contains(file) {
                    XCTFail("""
                    PBXFileElement's implementation of Hashable is broken somehow.
                    """)
                }
            }
        }
    }

Essentially, this test makes an array of all PBXFileElements by concatenating the PBXFileReferences, PBXGroups, and PBXVariantGroups, then adds this array to an empty set with formUnion, then checks that the array and the set have the same elements. A single pass usually succeeds, but roughly one iteration in 100 fails, which is why the test loops. When it fails, either some element is missing from the set, or Swift crashes with

Fatal error: Duplicate elements of type 'PBXFileElement' were found in a Set.
This usually means either that the type violates Hashable's requirements, or
that members of such a set were mutated after insertion.

Unless I deeply misunderstand something, this test should pass and should definitely not crash. I will probably trace down the root cause sometime in the near-ish future, but I figured I'd report my findings first.

Not sure why I didn't try testing this before, but I observe this same behavior with let filesArray = pbxproj.fileReferences too, so you don't have to type-erase into PBXFileElement or combine the files and groups together for this to be an issue.

I have narrowed it down significantly. Here is a test which should pass but does not.

    func test_equalImpliesEqualHash() throws {
        let xcodeProj = try XcodeProj(path: fixturesPath() + "iOS/Project.xcodeproj")
        let pbxproj = xcodeProj.pbxproj
        
        let files = pbxproj.fileReferences
        
        for file1 in files {
            for file2 in files {
                if file1 == file2 {
                    XCTAssert(file1.hashValue == file2.hashValue, "PBXFileReference violates Hashable's requirements. Equal file references have unequal hashes.")
                }
            }
        }
    }

Further, this fails every time I've run it, rather than only crashing sporadically.

In the cases where this fails (using the fixture project and one of my own, autogenerated by SPM), there are two files with the same name. In the fixture, the project knows of two Info.plist files, but they are different files with different parent groups. These files are incorrectly determined to be equal.
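To make the failure mode concrete outside of XcodeProj, here is a minimal sketch of the same mismatch. `FileRef` is a hypothetical stand-in, not the library's type: its `==` compares only properties while `hash(into:)` uses only the reference, which is roughly the situation described above.

```swift
// Hypothetical stand-in for a file reference; NOT XcodeProj's actual type.
final class FileRef: Hashable {
    let reference: String   // unique per object, like a PBX reference
    let path: String?

    init(reference: String, path: String?) {
        self.reference = reference
        self.path = path
    }

    // Equality looks only at the properties and ignores the reference...
    static func == (lhs: FileRef, rhs: FileRef) -> Bool {
        lhs.path == rhs.path
    }

    // ...while hashing looks only at the reference.
    func hash(into hasher: inout Hasher) {
        hasher.combine(reference)
    }
}

let a = FileRef(reference: "REF-A", path: "Info.plist")
let b = FileRef(reference: "REF-B", path: "Info.plist")

let areEqual = (a == b)                          // true: same path
let hashesMatch = (a.hashValue == b.hashValue)   // almost always false
print("equal:", areEqual, "hashes match:", hashesMatch)
```

Whether two such objects collide inside a Set depends on the per-process hash seed and on which bucket each hash lands in, which would be consistent with the sporadic failures in the first test.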

I think I've narrowed down the issue to a few possible places.

PBXFileElement's parent

PBXFileElement has a field parent which is not checked in the auto-generated equality method.

extension PBXFileElement {
    /// :nodoc:
    @objc override public func isEqual(to object: Any?) -> Bool {
        guard let rhs = object as? PBXFileElement else { return false }
        if sourceTree != rhs.sourceTree { return false }
        if path != rhs.path { return false }
        if name != rhs.name { return false }
        if includeInIndex != rhs.includeInIndex { return false }
        if usesTabs != rhs.usesTabs { return false }
        if indentWidth != rhs.indentWidth { return false }
        if tabWidth != rhs.tabWidth { return false }
        if wrapsLines != rhs.wrapsLines { return false }
        return super.isEqual(to: rhs)
    }
}

Adding this check makes all my tests pass

extension PBXFileElement {
    /// :nodoc:
    @objc override public func isEqual(to object: Any?) -> Bool {
        ...
        if parent != rhs.parent { return false }
        return super.isEqual(to: rhs)
    }
}

but might also be undesired behavior (is parent something intrinsic to the project, or is it just computed by scanning all children?), and could possibly introduce an expensive/infinite computation if there are many grandparents/a parent cycle.

However, I don't think this addresses the fundamental issue.

PBXObject.isEqual always returns true

PBXObject defines equality and hashes like so

    @objc dynamic func isEqual(to _: Any?) -> Bool {
        true
    }

    public func hash(into hasher: inout Hasher) {
        hasher.combine(reference)
    }

This is clearly incorrect behavior because isEqual can return true without the underlying references being the same, while the hash is computed from the reference alone. Maybe this is okay if PBXObject is expected to be an abstract class, but it could easily cause problems if a PBXObject is initialized directly. My tests pass if equality on PBXObject is replaced with

    @objc dynamic func isEqual(to other: Any?) -> Bool {
        guard let other = other as? PBXObject else { return false }
        return self.reference == other.reference
    }

but other tests centered around equality now fail.

It is my opinion that these new failures are justified: I don't think two objects should be equal unless their underlying references are. For example, in the case with the iOS fixture, there are two Info.plist's which share identical properties, but are legitimately different references. As a more contrived example, what if objects are compared across two different Xcode projects?

Proposal

IMO, the correct behavior here is that all children of PBXObject are equal if and only if their underlying references are equal. Consequently, I would recommend deleting the majority of the Stencil generated equality methods, using the new PBXObject equality above, and changing the failing tests accordingly. However, this will change the behavior of == so could be a breaking change for some users.

In particular, if you hold two objects with the same reference and change all of the properties of one of them, it is still equal to the other. I think this is probably reasonable, as only one object with a given reference could exist in the same Xcode project anyway. Arguably, we could switch to factory methods or something similar to make sure that only one Swift reference to a given Xcode reference can exist at one time, but that is probably more work than it's worth.
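As a rough sketch of what the proposal would mean in practice, the following uses a hypothetical `Node` class (not a type from XcodeProj) whose equality and hash both depend only on the reference. It also shows the behavior change mentioned above: two objects with the same reference stay equal even if their other properties diverge.

```swift
// Hypothetical illustration of reference-based equality; not XcodeProj code.
final class Node: Hashable {
    let reference: String
    var name: String?

    init(reference: String, name: String?) {
        self.reference = reference
        self.name = name
    }

    // Equality and hashing use exactly the same component: the reference.
    static func == (lhs: Node, rhs: Node) -> Bool {
        lhs.reference == rhs.reference
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(reference)
    }
}

// Two files with identical properties but different references stay distinct.
let plistA = Node(reference: "REF-A", name: "Info.plist")
let plistB = Node(reference: "REF-B", name: "Info.plist")
let files: Set<Node> = [plistA, plistB]
let owners: [Node: String] = [plistA: "TargetA", plistB: "TargetB"]

// Two objects with the same reference are equal even after one is mutated.
let original = Node(reference: "REF-C", name: "Old.swift")
let edited = Node(reference: "REF-C", name: "Old.swift")
edited.name = "New.swift"
let stillEqual = (original == edited)
```

With this definition the Hashable requirement holds by construction, at the cost of `==` no longer comparing property values.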

Summary

The current behavior of PBXFileElement.isEqual is unacceptable; it can result in a crash for the legitimate use case of using PBXFileReferences as keys to dictionaries. I see a few options:

  1. Update PBXFileElement.isEqual to check for equal parents and (informally) declare PBXObject to be abstract.
  2. Update PBXObject to compare references.
  3. Update PBXObject to compare references and delete all the overrides of PBXObject's equality.
  4. Just decide that PBXObject shouldn't be hashable and remove support for using them as keys to dictionaries entirely.

My preference is option 3. I would be more than happy to make a PR with any one of these changes, but I should not be the one to make a decision about which fix to implement. So. What do y'all think? What strategy is correct here?

You can reveal the issue everywhere with a runtime assertion like this:

diff --git a/Sources/XcodeProj/Objects/Project/PBXObject.swift b/Sources/XcodeProj/Objects/Project/PBXObject.swift
index 780f415..939a15a 100644
--- a/Sources/XcodeProj/Objects/Project/PBXObject.swift
+++ b/Sources/XcodeProj/Objects/Project/PBXObject.swift
@@ -57,7 +57,16 @@ public class PBXObject: Hashable, Decodable, Equatable, AutoEquatable {
 
     public static func == (lhs: PBXObject,
                            rhs: PBXObject) -> Bool {
-        lhs.isEqual(to: rhs)
+        let equal = lhs.isEqual(to: rhs)
+        #if DEBUG
+        if equal {
+            var (lh, rh) = (Hasher(), Hasher())
+            lh.combine(lhs)
+            rh.combine(rhs)
+            assert(lh.finalize() == rh.finalize(), "\(lhs) and \(rhs) are equal values, so they must have equal hash values")
+        }
+        #endif
+        return equal
     }
 
     func isEqual(to _: Any?) -> Bool {

This definitely feels like an issue, since it breaks Hashable's invariant. But I think replacing all our comparison logic with reference checking would be a dangerous behavior-break.

My understanding of XcodeProj is that it keeps references fairly transparent to the user. For instance, you can programmatically create an object, and it'll have a "TEMP-" UUID value as its reference until the project is encoded. The "value" of a PBXObject doesn't change when you encode the project, so it would be confusing if the values it was equal to changed after calling PBXProjEncoder.encode().

The right thing to do is probably to follow this note in Hashable's documentation:

Implement this method to conform to the Hashable protocol. The components used for hashing must be the same as the components compared in your type’s == operator implementation. Call hasher.combine(_:) with each of these components.

which would mean generating "memberwise" hashing methods using Sourcery, which feed every property we use for equality-checking into the hasher.
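A hand-written sketch of what such a generated "memberwise" hash might look like follows. `Element` and its properties are hypothetical simplifications, and the real generated code would presumably cover every property that the generated equality method compares.

```swift
// Hypothetical simplification; not the actual Sourcery output.
final class Element: Hashable {
    var reference: String   // deliberately excluded from == and hash
    var path: String?
    var name: String?

    init(reference: String, path: String?, name: String?) {
        self.reference = reference
        self.path = path
        self.name = name
    }

    // Equality compares the properties only.
    static func == (lhs: Element, rhs: Element) -> Bool {
        lhs.path == rhs.path && lhs.name == rhs.name
    }

    // Hashing feeds the same components, in the same order, into the hasher.
    func hash(into hasher: inout Hasher) {
        hasher.combine(path)
        hasher.combine(name)
    }
}

let e1 = Element(reference: "TEMP-1", path: "Info.plist", name: nil)
let e2 = Element(reference: "TEMP-2", path: "Info.plist", name: nil)

let equalValues = (e1 == e2)
let equalHashes = (e1.hashValue == e2.hashValue)
```

This restores the Hashable invariant, but note two consequences: objects that differ only by reference collapse into a single Set element, and mutating a property after insertion into a Set or Dictionary changes the hash, which Swift's collections do not support.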

Hashing everything except the reference, mirroring what equality currently does, would satisfy Hashable's requirements, but it still makes me uncomfortable.

I still think the current equality method (which was updated to include the parent) is incorrect. It is not difficult to find a situation in which an object has identical properties except for its reference. If I recall correctly, in one of the fixtures there is a project with two Info.plist files whose properties differ only by their reference and their parent. They manage to be the same because their path properties are both the relative path Info.plist, which is what Xcode sets it to if you make a new app target. However, the .parent property is not part of the xcodeproj spec, it is just computed and stored for convenience when a project is loaded, so I don't think .parent should be in the equality check either. But even if it is, if a user creates their own objects, they shouldn't have to set the .parent property themselves (since it isn't part of the spec).

Either way, it is conceivable that a user might create two objects with the same properties except the reference. Because these could be assigned as children to different objects, these are not exchangeable and should be compared unequal: one of them could (eventually) be set as the Info.plist of target A and the other of target B. Thus, it would be incorrect if adding these to a Set overwrote each other or if they were not distinct keys in a Dictionary (my use case). Thus in my opinion, the equality method can only be correct if it includes the reference.

If the library guarantees that it returns the same reference (as in the same Swift reference, compared using ===) every time it gives you an object with a fixed (xcodeproj) reference, this wouldn't cause problems on encoding because all equal objects would have their reference changed to the same thing at the same time. There are other solutions too if that is unsatisfactory.
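For illustration, here is one way a caller could key a Dictionary by Swift object identity (`===`) without relying on the library's `==` at all. This is only a sketch of the identity idea, not something proposed in the thread: `IdentityKey` and `Obj` are hypothetical names, and `ObjectIdentifier` is the standard library type used to hash by identity.

```swift
// Hypothetical stand-in for a project object whose reference may change
// (for example from a temporary value to a final one on encoding).
final class Obj {
    var reference: String
    init(reference: String) { self.reference = reference }
}

// Hypothetical wrapper that compares and hashes by object identity only.
struct IdentityKey<T: AnyObject>: Hashable {
    let object: T

    static func == (lhs: IdentityKey<T>, rhs: IdentityKey<T>) -> Bool {
        lhs.object === rhs.object
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(ObjectIdentifier(object))
    }
}

let x = Obj(reference: "TEMP-1")
let alias = x                       // same Swift object as x
let y = Obj(reference: "TEMP-1")    // different object, same reference string

var targets: [IdentityKey<Obj>: String] = [:]
targets[IdentityKey(object: x)] = "TargetA"
targets[IdentityKey(object: y)] = "TargetB"

// Changing the reference does not disturb the keys, since identity is stable.
x.reference = "ABC123"
let lookedUp = targets[IdentityKey(object: alias)]
```

The wrapper retains the wrapped object, so the identity it hashes cannot be reused by another object while the key is alive.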

Hola 👋,

We want to inform you that the issue has been marked as stale. This means that there hasn't been any activity or updates on it for quite some time, and it's possible that it may no longer be relevant or actionable.
If you still believe that this issue is valid and requires attention, please provide an update or any additional information that can help us address it. Otherwise, we may consider closing it in the near future.
Thank you for your understanding.

Hola 👋,

We want to inform you that we have decided to close this stale issue as there hasn't been any activity or response regarding it after marking it as stale.

We understand that circumstances may have changed or priorities may have shifted, and that's completely understandable. If you still believe that this issue needs to be addressed, please feel free to reopen it and provide any necessary updates or additional information.

We appreciate your understanding and look forward to your continued contributions to the project.

Thank you.