microsoft / win32metadata

Tooling to generate metadata for Win32 APIs in the Windows SDK.

Cannot specify `cString` in `ScriptStringAnalyse`

wuweiran opened this issue

Summary

windows::Win32::Globalization::ScriptStringAnalyse passes `pidx.as_deref().map_or(0, |slice| slice.len().try_into().unwrap())` as the third argument (`cString`) of the underlying FFI function ScriptStringAnalyse. This prevents the user from setting `cString` (the length of the string to analyze) without also specifying `piDx`, which is optional. Please report this for me if it is caused by incorrect Win32 metadata.

Crate manifest

[dependencies.windows]
version = "0.54.0"

Crate code

ScriptStringAnalyse(
    udc,
    context.buffer.as_ptr() as _,
    (1.5 * length as f32 + 16f32) as i32,
    -1,
    SSA_LINK | SSA_FALLBACK | SSA_GLYPHS,
    -1,
    None,
    None,
    None, // cannot specify `cString` with this parameter being `None`
    None,
    null(),
    &mut context.ssa,
)?;

The NativeArrayInfo attribute binds the two parameters together. Perhaps the SAL annotation is incorrect, or perhaps its interpretation is incorrect.

I don't know anything about this function, so I can't be sure, but I can transfer this issue to the Win32 metadata repo for consideration.

This looks like a genuine code-gen bug, where the [Optional] attribute of a parameter is retroactively applied to a non-optional parameter referenced in SAL annotations.

This is the (abridged) C definition:

HRESULT WINAPI ScriptStringAnalyse(
    // ...
    int                               cString,    //In  Length in characters (Must be at least 1)
    // ...
    _In_reads_opt_(cString) const int *piDx,      //In  Requested logical dx array
    // ...
);

The corresponding metadata accurately captures the signature information:

HRESULT ScriptStringAnalyse(
    // ...
    [In] int cString,
    // ...
    [Optional][In][Const][NativeArrayInfo(CountParamIndex = 2)] int* piDx,
    // ...
);

The generator uses this data in preparation for the "ptr + length" -> "slice" transformation but fails to acknowledge that while the ptr is optional, the length is not.
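To make the failure mode concrete, here is a minimal sketch of the "ptr + length" -> "slice" transformation. It is not the actual generated code: `raw_analyse` and `wrapped_analyse` are hypothetical stand-ins reduced to the two parameters involved, and only the length expression is taken from the issue summary.

```rust
use std::convert::TryInto;

/// Hypothetical stand-in for the raw FFI function. It only reports what it
/// received: the length, and whether the array pointer was null.
fn raw_analyse(c_string: i32, pi_dx: *const i32) -> (i32, bool) {
    (c_string, pi_dx.is_null())
}

/// Hypothetical wrapper in the shape described above: `cString` is no longer
/// a parameter. It is derived from the optional slice instead.
fn wrapped_analyse(pidx: Option<&[i32]>) -> (i32, bool) {
    // Same expression as quoted in the issue summary.
    let c_string: i32 = pidx
        .as_deref()
        .map_or(0, |slice| slice.len().try_into().unwrap());
    // The pointer half of the pair: null when no slice is supplied.
    let pi_dx: *const i32 = pidx.map_or(std::ptr::null(), |slice| slice.as_ptr());
    raw_analyse(c_string, pi_dx)
}
```

With this shape, `wrapped_analyse(None)` always forwards a length of 0, so a caller who wants to omit `piDx` has no way to pass the required `cString`.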

The [NativeArrayInfo] attribute is initially evaluated here:

https://github.com/microsoft/windows-rs/blob/994dc7519fcb3ece2035f7fc4607db9ca6a8047c/crates/libs/bindgen/src/metadata.rs#L321-L328

While I understand that the current Rust signature for ScriptStringAnalyse() leaves the API partially unusable, I'm struggling to come up with a workable solution to address this issue. At the very least, the code generator needs to inhibit the "ptr + length" -> "slice" transformation if either parameter is [Optional].

This could be controlled from metadata with the [PreserveSig = true] attribute, which would affect all downstream clients (including this repo). The alternative would be to deal with this in the code generator. I don't know how complex that would be, or even which of the four [Optional] combinations are "transformation safe".

I think some of the confusion can be found here:

_Check_return_ HRESULT WINAPI ScriptStringAnalyse(
    HDC                                             hdc,        //In  Device context (required)
    const void                                      *pString,   //In  String in 8 or 16 bit characters
    int                                             cString,    //In  Length in characters (Must be at least 1)
    int                                             cGlyphs,    //In  Required glyph buffer size (default cString*1.5 + 16)
    int                                             iCharset,   //In  Charset if an ANSI string, -1 for a Unicode string
    DWORD                                           dwFlags,    //In  Analysis required
    int                                             iReqWidth,  //In  Required width for fit and/or clip
    _In_reads_opt_(1) SCRIPT_CONTROL               *psControl, //In  Analysis control (optional)
    _In_reads_opt_(1) SCRIPT_STATE                 *psState,   //In  Analysis initial state (optional)
    _In_reads_opt_(cString) const int              *piDx,      //In  Requested logical dx array
    _In_reads_opt_(1) SCRIPT_TABDEF                *pTabdef,   //In  Tab positions (optional)
    const BYTE                                      *pbInClass, //In  Legacy GetCharacterPlacement character classifications (deprecated)
    _Outptr_result_buffer_(1) SCRIPT_STRING_ANALYSIS    *pssa);     //Out Analysis of string

There appears to be a relationship between pString and cString in the comments, but SAL only indicates a relationship between cString and piDx, so that's the one that windows-bindgen (the Rust code generator) honors since it doesn't read comments...

There are other APIs where multiple parameters point to the same length parameter and in such cases windows-bindgen will reject the relationship and thus omit the transformation.
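The rejection rule just described can be sketched roughly as follows. This is a simplified model, not windows-bindgen's implementation: `Param` and `slice_candidates` are hypothetical names, and a parameter is reduced to its name plus the index given by `NativeArrayInfo(CountParamIndex = ..)`.

```rust
/// Hypothetical, simplified model of a parameter as described by metadata.
struct Param {
    name: &'static str,
    /// Index of the length parameter named by `CountParamIndex`, if any.
    count_param_index: Option<usize>,
}

/// Returns the names of array parameters that would be turned into slices:
/// only those whose length parameter is referenced by exactly one array.
fn slice_candidates(params: &[Param]) -> Vec<&'static str> {
    params
        .iter()
        .filter(|p| match p.count_param_index {
            Some(idx) => {
                // Count how many parameters share this length parameter.
                let sharers = params
                    .iter()
                    .filter(|q| q.count_param_index == Some(idx))
                    .count();
                sharers == 1
            }
            None => false,
        })
        .map(|p| p.name)
        .collect()
}
```

Under this model, the current metadata (only `piDx` refers to `cString`) yields `piDx` as a slice candidate. If `pString` referred to `cString` as well, neither parameter would be transformed and `cString` would remain an explicit argument.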

Thanks for the thorough analysis, Kenny. That makes sense.

The actual fix would then be to update the SAL annotations for ScriptStringAnalyse(), and windows-bindgen will do the right thing as the changes eventually trickle down.

Specifically, pString needs to be associated with cString. This is somewhat complicated because cString is measured in characters, while pString can be either a narrow or a wide character string (depending on the value of iCharset). In addition, pString is typed as a void*, so _In_reads_(s) does nothing: there is no type information with which to scale s.

I played around with the annotations a bit longer and came up with this:

_In_reads_bytes_(((iCharset == -1) + 1) * cString) const void *pString
//               ^^^^^^^^^^^^^^^^^^^^^^ 2 if iCharset == -1 (Unicode)
//                                      1 otherwise (ANSI)

This reliably reports pre-condition violations during code analysis of a C program. While (presumably) correct, I don't know how well ClangSharp or the metadata artifacts can handle complex SAL annotations.
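For readers who want to check the arithmetic in that annotation, the byte-count expression can be mirrored in a few lines of Rust. `string_bytes` is a hypothetical helper for illustration only. It assumes, as in the annotation above, that `iCharset == -1` means a Unicode string with 2-byte characters and any other value means 1-byte characters.

```rust
/// Hypothetical helper mirroring `((iCharset == -1) + 1) * cString`.
/// The comparison converts to 1 for Unicode and 0 otherwise, so the
/// per-character width is 2 or 1 bytes respectively.
fn string_bytes(i_charset: i32, c_string: usize) -> usize {
    ((i_charset == -1) as usize + 1) * c_string
}
```

For a four-character string this gives 8 bytes when `i_charset` is -1 and 4 bytes for any other charset value.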

Metadata has no SAL support to speak of, so we'd be manually adding any attributes needed here.