tomquirk / linkedin-api

👨‍💼Linkedin API for Python


search_people

yaruqian opened this issue · comments

Hi

It seems search_people is broken again: it always returns an empty list. Without it, I don't think get_profile_network_info is usable, since it needs a public id.

Does anyone know how to solve this issue? Thanks!

I can report that it is very hit and miss. My use case is searching for people with the current_company attribute. It returns some people (when there are no keywords), but it doesn't return everyone, and it doesn't return the relevant people when a keyword is supplied.


Does anyone know how to fix this? I was searching by title and current company, and it always returns empty. The fix-search-metadata branch doesn't seem to fix it.

So I have been doing some testing on my own.
There seem to be two issues when searching for people inside companies:

1- The flagshipSearchIntent in the search function has to be ORGANIZATIONS_PEOPLE_ALUMNI, otherwise the entityResult variable inside each result comes back null:
f"flagshipSearchIntent:ORGANIZATIONS_PEOPLE_ALUMNI,"

2- Also, when calling search_people in this case, you must set include_private_profiles so that all possible profiles show up in the return:

employees = api.search_people(keywords=["diretor comercial"], current_company=["17938460"], limit=10, include_private_profiles=True)

The problem I'm having right now is translating the urn_id in the results into a proper universal_name profile id. Passing it to the get_profile function returns "request failed: This profile can't be accessed" from LinkedIn.

I saw that LinkedIn lazy-loads the profile ids after searching for people inside a company, so I implemented a method that does the same.

```python
# Needed at module level (linkedin.py may already import these):
from urllib.parse import quote

from linkedin_api.utils.helpers import get_id_from_urn, get_urn_from_raw_update


def get_public_identifiers_from_urns(self, urn_ids=None):
    """Fetch public identifiers for a list of LinkedIn URN IDs.

    :param urn_ids: URN IDs of the profiles returned by the search (str)
    :type urn_ids: list[str], optional

    :return: One dict per profile found in the response
    :rtype: list
    """
    # Nothing to look up; also avoids calling len() on None.
    if not urn_ids:
        return []

    beginning = "urn:li:fsd_lazyLoadedActions:(urn:li:fsd_profileActions:("
    end = ",SEARCH,EMPTY_CONTEXT_ENTITY_URN),PEOPLE,SEARCH_SRP)"

    # URL-encode each lazy-loaded-actions URN and join them with commas.
    final_string = ",".join(
        quote(f"{beginning}{urn_id}{end}") for urn_id in urn_ids
    )

    res = self._fetch(
        f"/graphql?variables=(lazyLoadedActionsUrns:List("
        f"{final_string}"
        f"))&queryId=voyagerSearchDashLazyLoadedActions"
        f".805d3430ded0f28feeae5a3cbd74820b"
    )

    data = res.json()

    data_clusters = data.get("included", [])

    if not data_clusters:
        return []

    results = []
    for item in data_clusters:
        # Skip everything in the response that is not a profile entity.
        if item.get("$type") != "com.linkedin.voyager.dash.identity.profile.Profile":
            continue
        results.append(
            {
                "urn_id": get_id_from_urn(
                    get_urn_from_raw_update(item.get("entityUrn", None))
                ),
                "title": item.get("headline", None),
                "publicIdentifier": item.get("publicIdentifier", None),
                "firstName": item.get("firstName", None),
                "lastName": item.get("lastName", None),
            }
        )

    return results
```

The problem is that this function only loads the profiles that it assumes I can have a connection to (I'm inferring this from some testing I did, though I could be wrong), so I'm finding it impossible to get the correct profile ids for everything search_people returns. If you include the new 'includeWebMetadata=true' in the search function, it can also return the profile ids together with the usual data, but again only for profiles that you seem to have a connection to.
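For the profiles that do come back, the two result sets can be joined on urn_id. Here is a minimal sketch, assuming the search results are dicts with a `urn_id` key and the lazy-loaded lookup returns dicts shaped like the ones built by the method above. `attach_public_identifiers` is a hypothetical helper name, not part of linkedin_api.

```python
def attach_public_identifiers(search_results, lazy_profiles):
    """Join search results with lazy-loaded profiles on urn_id.

    Hypothetical helper: each returned dict is a copy of the search
    result with an extra "public_id" key, which is None when the
    lazy-loaded lookup did not return that profile.
    """
    # Map urn_id -> publicIdentifier for every profile that was resolved.
    by_urn = {
        profile["urn_id"]: profile.get("publicIdentifier")
        for profile in lazy_profiles
        if profile.get("urn_id")
    }

    merged = []
    for person in search_results:
        enriched = dict(person)  # copy, so the caller's data is untouched
        enriched["public_id"] = by_urn.get(person.get("urn_id"))
        merged.append(enriched)
    return merged
```

Entries whose `public_id` ends up as None are the ones LinkedIn did not resolve, which makes it easy to see how many profiles fall outside your network.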

@Botelho31 this is amazing research - huge thanks for your efforts!

  • ORGANIZATIONS_PEOPLE_ALUMNI is a great find. I'll have a play with it
  • In theory, include_private_profiles does what it says: it'll include profiles that are marked as 'private'. This might mean your search didn't have any public profiles in its result set.
  • I'll have to take a look at the urn_id findings!
  • "function only loads the profiles that it assumes I can have a connection to" - this is definitely accurate. LinkedIn will exclude (or severely downrank) profiles in searches if they're too far away from your network. For best results, they should be 1st, 2nd or 3rd degree connections (you can verify this for yourself on linkedin.com)
  • includeWebMetadata=true has now been merged to master branch and is on the latest PyPI release of linkedin_api.

@Botelho31 Thank you so much!! I tested it on my own and it works pretty nicely. I guess the only missing part is converting the urn to a LinkedIn URL or something like a full name. Right now, I am scraping the whole search page and trying to match the urn.

It's also not possible for me to turn a search result into a profile, for example:

person = api.get_profile(urn_id="ACr__7Lxlo8BaSdJJV8ecwmXHvHWK9rR04ShSgs")

INFO:linkedin_api.linkedin:request failed: This profile can't be accessed
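Until the urn_id lookup is sorted out, a defensive wrapper can at least separate the profiles that resolve from the ones that don't. This is a rough sketch, assuming get_profile accepts `public_id` and `urn_id` keyword arguments and returns an empty dict when the request fails (I have not confirmed that for every release). `fetch_profiles` and the `public_id` key on each person are hypothetical.

```python
def fetch_profiles(api, people):
    """Fetch a profile for each person, preferring the public id.

    Hypothetical helper. `api` is anything with a
    get_profile(public_id=..., urn_id=...) method; `people` is a list
    of dicts with "urn_id" and, optionally, "public_id".

    Returns (profiles, failed_urn_ids).
    """
    profiles, failed = [], []
    for person in people:
        public_id = person.get("public_id")
        urn_id = person.get("urn_id")

        if public_id:
            profile = api.get_profile(public_id=public_id)
        else:
            profile = api.get_profile(urn_id=urn_id)

        # Assumption: an empty/falsy result means the request failed.
        if profile:
            profiles.append(profile)
        else:
            failed.append(urn_id)
    return profiles, failed
```

The `failed` list gives you the urn_ids that still need another route, such as matching against the scraped search page.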

Something I had trouble with was adding the school to the search_people filter. I fixed it by changing the filter key in the search_people function from (key:schools,value:List({stringify})) to (key:schoolFilter,value:List({stringify})), and it worked for me. Could be something to include in a future update :)
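To make the shape of that clause concrete, here is a small standalone sketch that renders one filter of this form. `build_filter` is a hypothetical name, and the `" | "` separator between values is my assumption about how the values get joined, so check it against the library source before relying on it.

```python
def build_filter(key, values, sep=" | "):
    """Render one search filter clause of the form
    (key:<key>,value:List(<values>)).

    Hypothetical helper; `sep` is an assumed separator between values.
    """
    stringify = sep.join(values)
    return f"(key:{key},value:List({stringify}))"


# The renamed school filter with a placeholder school id.
school_clause = build_filter("schoolFilter", ["1234"])
```

Passing "schools" or "schoolFilter" as the key makes it easy to try both variants side by side.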