Media Content Analysis

Description

Description Generator Concept; CV Concept

Content

[Complete] Image data input (JPG, etc.)
[Complete] Object detection: Fine-tune VGG16 Image Classification (subject, verb and confidence value)
- Build
- Train
- Predict
[Complete] Scene detection: Keras + VGG16 + Places365 (place and confidence value)
[Incomplete] Category Filter
- [Complete] fastText & TextGrocery (text category and confidence value)
- [In progress] LCS Confidence + Subject Confidence
[Incomplete] NLP - Sentence generation and probability
- n-gram: [S V O P]
- NLTK
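
As an illustration of the n-gram idea above, here is a minimal sketch of a bigram model that scores a candidate [S V O P] word sequence. It is not the repository's implementation: the toy corpus, the function names, and the unsmoothed maximum-likelihood estimate are all assumptions made for the example.

```python
from collections import Counter, defaultdict

# Hypothetical sentence boundary markers used to pad each token sequence.
START, END = "<s>", "</s>"


def train_bigrams(sentences):
    """Count bigrams over tokenized sentences padded with START/END."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        padded = [START, *tokens, END]
        for prev, cur in zip(padded, padded[1:]):
            counts[prev][cur] += 1
    return counts


def bigram_prob(counts, prev, cur):
    """Unsmoothed maximum-likelihood estimate of P(cur | prev)."""
    total = sum(counts[prev].values())
    return counts[prev][cur] / total if total else 0.0


def sentence_prob(counts, tokens):
    """Probability of a token sequence as a product of bigram probabilities."""
    padded = [START, *tokens, END]
    prob = 1.0
    for prev, cur in zip(padded, padded[1:]):
        prob *= bigram_prob(counts, prev, cur)
    return prob


# Toy corpus (made up for illustration): subject, verb, object, place.
corpus = [
    ["man", "rides", "bike", "street"],
    ["man", "rides", "horse", "beach"],
    ["dog", "chases", "ball", "park"],
]
model = train_bigrams(corpus)

seen = sentence_prob(model, ["man", "rides", "bike", "street"])
unseen = sentence_prob(model, ["dog", "rides", "bike", "street"])
print(seen, unseen)
```

Without smoothing, any sequence containing an unseen bigram scores zero, so a real system would likely add smoothing or back-off.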

Output Data Model

Recognition Data Format:

"Description": {
    "tags": [{
        "obj": "string",
        "label": "string",
        "confidence": value
    }, …],
    "caption": [{
        "text": "string",
        "confidence": value
    }],
    "category": [{
        "name": "string",
        "score": value
    }]
}
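
One way to produce a record in the format above is sketched below with Python dataclasses. The class names and sample values are hypothetical and only mirror the field names shown; the repository may build this structure differently.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Tag:
    obj: str
    label: str
    confidence: float


@dataclass
class Caption:
    text: str
    confidence: float


@dataclass
class Category:
    name: str
    score: float


@dataclass
class Description:
    tags: List[Tag] = field(default_factory=list)
    caption: List[Caption] = field(default_factory=list)
    category: List[Category] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recursively converts the nested dataclasses to dicts.
        return json.dumps({"Description": asdict(self)}, indent=4)


# Sample values are invented for illustration only.
desc = Description(
    tags=[Tag(obj="person", label="riding", confidence=0.91)],
    caption=[Caption(text="a person riding a bike", confidence=0.78)],
    category=[Category(name="sports", score=0.66)],
)
print(desc.to_json())
```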

Detection & Recognition:

Keras + CRFasRNN

MaskRNN (attempted) [Failed]
subject + facial + action + scene

NLG (Natural Language Generation):

n-gram + NLTK

Raw text Processing
Sentence Segmentation (lists of strings)
Tokenization (sentences)
Categorizing and Tagging words (tokenized sentences)
Extracting and recognizing entities (pos-tagged sentences)
Analyzing sentence structure (chunked sentences)
Build grammars (relations)

Result:

Subject & score  
+  
Verb & score  
+  
Object & score  
+  
Context & score  
+  
Place & score
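
The source does not say how the individual scores are merged. The sketch below assumes a geometric mean over the available slots and joins the slot texts in order, yielding one entry in the shape of the "caption" field. The `Slot` class, the `combine` function, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from math import prod  # available in Python 3.8+


@dataclass
class Slot:
    role: str    # e.g. "subject", "verb", "object", "context", "place"
    text: str
    score: float


def combine(slots):
    """Join slot texts in order; the geometric mean of scores is an assumed rule."""
    filled = [s for s in slots if s.text]
    text = " ".join(s.text for s in filled)
    scores = [s.score for s in filled]
    confidence = prod(scores) ** (1.0 / len(scores)) if scores else 0.0
    return {"text": text, "confidence": confidence}


# Sample detections, invented for illustration.
slots = [
    Slot("subject", "a man", 0.90),
    Slot("verb", "rides", 0.80),
    Slot("object", "a bike", 0.85),
    Slot("context", "outdoors", 0.70),
    Slot("place", "on a street", 0.75),
]
caption = combine(slots)
print(caption)
```

A geometric mean keeps the result on the same 0 to 1 scale as the inputs and penalizes a single weak slot more than an arithmetic mean would; a plain product or a weighted sum are equally plausible alternatives.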

Reference

About

Source code for the IEEE ICSC 2020 paper: A Refined Neural Network Recognition Architecture for Blurred Image Semantic Generalization

GNU General Public License v3.0
