bigscience-workshop / data_tooling

Tools for managing datasets for governance and training.

Create dataset stack_exchange_website

albertvillanova opened this issue

  • uid: stack_exchange_website
  • type: primary
  • description:
    • name: Stack Exchange Website
    • description: Launched in 2010, the Stack Exchange network comprises 173 Q&A communities including Stack Overflow, the largest, most trusted online community for developers to learn, share their knowledge, and build their careers.
    • homepage: https://stackexchange.com/
    • validated: True
  • languages:
    • language_names:
      • English
    • language_comments:
    • language_locations:
      • Northern America
    • validated: False
  • custodian:
  • availability:
    • procurement:
      • for_download: No - we would need to spontaneously reach out to the current owners/custodians
      • download_url:
      • download_email: legal@stackoverflow.com
    • licensing:
      • has_licenses: Yes

      • license_text: Subscriber Content

        You agree that any and all content, including without limitation any and all text, graphics, logos, tools, photographs, images, illustrations, software or source code, audio and video, animations, and product feedback (collectively, “Content”) that you provide to the public Network (collectively, “Subscriber Content”), is perpetually and irrevocably licensed to Stack Overflow on a worldwide, royalty-free, non-exclusive basis pursuant to Creative Commons licensing terms (CC BY-SA 4.0), and you grant Stack Overflow the perpetual and irrevocable right and license to access, use, process, copy, distribute, export, display and to commercially exploit such Subscriber Content, even if such Subscriber Content has been contributed and subsequently removed by you as reasonably necessary to, for example (without limitation):

        Provide, maintain, and update the public Network
        Process lawful requests from law enforcement agencies and government agencies
        Prevent and address security incidents and data security features, support features, and to provide technical assistance as it may be required
        Aggregate data to provide product optimization
        

        This means that you cannot revoke permission for Stack Overflow to publish, distribute, store and use such content and to allow others to have derivative rights to publish, distribute, store and use such content. The CC BY-SA 4.0 license terms are explained in further detail by Creative Commons, and the license terms applicable to content are explained in further detail here. You should be aware that all Public Content you contribute is available for public copy and redistribution, and all such Public Content must have appropriate attribution.

        As stated above, by agreeing to these Public Network Terms you also agree to be bound by the terms and conditions of the Acceptable Use Policy incorporated herein, and hereby acknowledge and agree that any and all Public Content you provide to the public Network is governed by the Acceptable Use Policy.

      • license_properties:
        • open license
      • license_list:
        • cc-by-sa-4.0: Creative Commons Attribution Share Alike 4.0 International
    • pii:
      • has_pii: Yes
      • generic_pii_likely: very likely
      • generic_pii_list:
        • names
        • website account name or handle
        • email addresses
      • numeric_pii_likely: somewhat likely
      • numeric_pii_list:
        • telephone numbers
      • sensitive_pii_likely: very likely
      • sensitive_pii_list:
        • political opinions
        • racial or ethnic origin
        • religious or philosophical beliefs
      • no_pii_justification_class:
      • no_pii_justification_text:
    • validated: False
  • source_category:
    • category_type: website
    • category_web: forum
    • category_media:
    • validated: False
  • media:
    • category:
      • text
    • text_format:
      • .HTML
    • audiovisual_format:
    • image_format:
    • database_format:
    • text_is_transcribed: No
    • instance_type: post
    • instance_count: 1M<n<1B
    • instance_size: 100<n<10,000
    • validated: False
  • fname: stack_exchange_website.json
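
The entry above marks most of its sections as `validated: False`. As a rough illustration of how such a record could be checked programmatically, here is a minimal Python sketch. It assumes that `stack_exchange_website.json` mirrors the bullet nesting shown above; the real file layout is not shown in this issue, so the embedded excerpt and the helper name `unvalidated_sections` are hypothetical.

```python
import json

# Hypothetical excerpt of stack_exchange_website.json. The nesting is an
# assumption that mirrors the bullet list in this issue, trimmed to a few fields.
RECORD_JSON = """
{
  "uid": "stack_exchange_website",
  "type": "primary",
  "description": {"name": "Stack Exchange Website", "validated": true},
  "languages": {"language_names": ["English"], "validated": false},
  "availability": {
    "licensing": {"license_list": ["cc-by-sa-4.0"]},
    "pii": {"has_pii": "Yes"},
    "validated": false
  },
  "source_category": {
    "category_type": "website",
    "category_web": "forum",
    "validated": false
  },
  "media": {"category": ["text"], "instance_type": "post", "validated": false}
}
"""


def unvalidated_sections(record):
    """Return the names of top-level sections whose 'validated' flag is False."""
    return [
        name
        for name, section in record.items()
        if isinstance(section, dict) and section.get("validated") is False
    ]


record = json.loads(RECORD_JSON)
pending = unvalidated_sections(record)
print(pending)
```

With the excerpt above, `pending` lists every section except `description`, which is the only one marked as validated in this entry.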