usc-isi-i2 / Web-Karma

Information Integration Tool

Home Page: http://www.isi.edu/integration/karma/

Multi-line errors in Spark Framer: only the first entity is fully structured

GullyAPCBurns opened this issue

Looking through the data from our use of Karma: the system executes without failing, but the output appears to contain errors. The following record was generated from a multi-line model. The 'isTaughtBy' relation in the record below is only structured correctly for the first entry; the remaining entries appear as bare URIs.


{
  "@context": "file:/Users/Gully/Documents/Projects/2_active/bigDataU/work/2016-10-11-eruditeKarma/karma-context.json", 
  "a": [
    "LearningResource"
  ], 
  "description": "This is an Archived Course \\\\n EdX keeps courses open for enrollment after they end to allow learners to explore content and continue learning.  All features and materials may not be all available . Check back often to see when new course start dates are announced. \\\\n The world is full of uncertainty: accidents, storms, unruly financial markets, noisy communications. The world is also full of data. Probabilistic modeling and the related field of statistical inference are the keys to analyzing data and making scientifically sound predictions. \\\\n Probabilistic models use the language of mathematics. But instead of relying on the traditional \\theorem - proof\\\" format, we develop the material in an intuitive -- but still rigorous and mathematically precise -- manner. Furthermore, while the applications are multiple and evident, we emphasize the basic concepts and methodologies that are universally applicable. \\\\n The course covers all of the basic probability concepts, including: \\\\n \\\\n multiple discrete or continuous random variables, expectations, and conditional distributions \\\\n laws of large numbers \\\\n the main tools of Bayesian inference methods \\\\n an introduction to random processes (Poisson processes and Markov chains) \\\\n \\\\n The contents of this course are essentially the same as those of the corresponding MIT class ( Probabilistic Systems Analysis and Applied Probability ) -- a course that has been offered and continuously refined over more than 50 years. It is a challenging class, but it will enable you to apply the tools of probability theory to real-world applications or your research. \\\\n Can I still register after the start date? \\\\nYou can register at any time, but you will not get credit for any assignments that are past due. \\\\n How long does this course last? \\\\nThe course starts on Tuesday, February 3, 2015 and ends on the due date of the final exam, on Thursday, May 26, 2015. 
\\\\n What is the format of the class? \\\\nThe course material is organized along units that are aligned with the chapters of the textbook. Each unit contains between one and three lecture sequences. Each lecture sequence consists of short video clips, interleaved with short problems to test your understanding. Each unit also contains a wealth of supplementary material, including videos that go through the solution of various problems. \\\\n What textbook do I need for the course? \\\\nNone - there is no required textbook. The class follows closely the text Introduction to Probability, 2nd edition, by Bertsekas and Tsitsiklis, Athena Scientific, 2008; see the publisher's website or Amazon.com for more information. However, while this textbook is recommended, the materials provided by this course are self-contained. \\\\n Do I need to watch the lectures live? \\\\nVideo lectures as well as worked problems will be available and you can watch these at your own convenience. Homework assignments and exams, however, will have due dates. \\\\n Will the text of the video clips be available? \\\\nYes, we will provide transcripts of all clips (lectures, worked problems, etc.) that are synched to the videos. \\\\n How are grades assigned? \\\\nGrades (Pass or Not Pass) are based on a combination of scores on the weekly homework assignments (11 total), two midterm exams, and a final exam. \\\\n How much do I need to work for this class? \\\\nThis is an ambitious class in that it covers a lot of material in substantial depth. Furthermore, the only way of mastering the subject is by actually solving on your own a fair number of problems. MIT students who take the corresponding residential class typically report an average of 11-12 hours spent each week, including lectures, recitations, readings, homework, and exams.\"", 
  "hasTag": [
    " MOOC", 
    " probability_statistics ", 
    " overview ", 
    "video "
  ], 
  "isProvidedBy": [
    {
      "a": "Provider", 
      "full_name": "edX ", 
      "uri": "http://xcri.co.uk/Provider#edX_"
    }, 
    "http://xcri.co.uk/Provider#_MITx"
  ], 
  "isTaughtBy": [
    {
      "a": [
        "Person"
      ], 
      "full_name": " Dimitri Bertsekas ", 
      "uri": "http://schema.org/Person#_Dimitri_Bertsekas_"
    }, 
    "http://schema.org/Person#_John_Tsitsiklis_", 
    "http://schema.org/Person#_Zied_Ben_Chaouch_", 
    "http://schema.org/Person#Kuang_Xu_", 
    "http://schema.org/Person#_Qing_He_", 
    "http://schema.org/Person#_Jimmy_Li", 
    "http://schema.org/Person#_Katie_Szeto_", 
    "http://schema.org/Person#_Jagdish_Ramakrishnan_", 
    "http://schema.org/Person#_Patrick_Jaillet_"
  ], 
  "subtitle": "An introduction to probabilistic models, including random processes and the basic elements of statistical inference.", 
  "title": "Introduction to Probability - The Science of Uncertainty", 
  "uri": "https://www.edx.org/course/introduction-probability-science-mitx-6-041x-1"
}
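To see how widespread this is across the output, a small check along these lines may help. It is only a sketch: `unframed_entries` is a hypothetical helper (not part of Karma), and the sample is a trimmed-down copy of the record above with only the fields needed for the check.

```python
import json


def unframed_entries(record, relation):
    """Return the entries of `relation` that are bare URI strings
    rather than nested (framed) objects."""
    values = record.get(relation, [])
    if not isinstance(values, list):
        values = [values]
    return [v for v in values if isinstance(v, str)]


# Trimmed-down copy of the record above.
record = json.loads("""
{
  "isTaughtBy": [
    {"a": ["Person"],
     "full_name": " Dimitri Bertsekas ",
     "uri": "http://schema.org/Person#_Dimitri_Bertsekas_"},
    "http://schema.org/Person#_John_Tsitsiklis_",
    "http://schema.org/Person#_Zied_Ben_Chaouch_"
  ]
}
""")

# Lists the instructor references that were left as plain URIs.
print(unframed_entries(record, "isTaughtBy"))
```

Running the same check on 'isProvidedBy' in the full record would flag the second provider as well, since it is also a bare URI.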

Which workflow.apply_framer parameters are you using? The last parameter is the number of objects it should join when framing; if you set it to None, it should join all of them.
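To illustrate why a join limit could produce exactly this symptom, here is a minimal sketch of the idea. It is not Karma's implementation and does not use the real `workflow.apply_framer` signature: `frame_relation`, `objects_by_uri`, and `max_join` are hypothetical names, and the per-relation limit is just one possible reading of "number of objects it should join".

```python
def frame_relation(record, relation, objects_by_uri, max_join=None):
    """Replace URI references in `relation` with their full objects,
    for at most `max_join` references (None means no limit)."""
    framed = []
    joined = 0
    for value in record.get(relation, []):
        can_join = max_join is None or joined < max_join
        if isinstance(value, str) and value in objects_by_uri and can_join:
            framed.append(objects_by_uri[value])
            joined += 1
        else:
            # Limit reached or unknown URI: leave the bare reference as-is.
            framed.append(value)
    return {**record, relation: framed}


# Hypothetical inputs: a course that references two people by URI.
people = {
    "http://schema.org/Person#A": {"a": ["Person"], "full_name": "A",
                                   "uri": "http://schema.org/Person#A"},
    "http://schema.org/Person#B": {"a": ["Person"], "full_name": "B",
                                   "uri": "http://schema.org/Person#B"},
}
course = {"isTaughtBy": ["http://schema.org/Person#A",
                         "http://schema.org/Person#B"]}

limited = frame_relation(course, "isTaughtBy", people, max_join=1)
unlimited = frame_relation(course, "isTaughtBy", people, max_join=None)
```

With `max_join=1`, only the first person is expanded and the second stays a bare URI, which matches the shape of the record in this issue; with `max_join=None`, both are expanded.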