Bowen Jiang (Lauren) (bowen-upenn)

User data from GitHub: https://github.com/bowen-upenn

Company: GRASP Lab, University of Pennsylvania

Location: Philadelphia, United States

Home Page: https://sites.google.com/seas.upenn.edu/bowenjiang/

GitHub: @bowen-upenn

Twitter: @laurenbjiang

Bowen Jiang (Lauren)'s repositories

ControlText

ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations

Language: Python | License: Apache-2.0 | Stargazers: 31 | Issues: 4 | Issues: 2

scene_graph_commonsense

[WACV 2025] Enhancing Scene Graph Generation with Hierarchical Relationships and Commonsense Knowledge

Language: Python | License: MIT | Stargazers: 27 | Issues: 1 | Issues: 5

Agent_Rationality

[NAACL 2025] Towards Rationality in Language and Multimodal Agents: A Survey

License: MIT | Stargazers: 26 | Issues: 1 | Issues: 0

llm_token_bias

[EMNLP 2024] A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners

Language: Python | License: MIT | Stargazers: 19 | Issues: 2 | Issues: 0

Multi-Agent-VQA

[CVPR 2024 CVinW] Multi-Agent VQA: Exploring Multi-Agent Foundation Models on Zero-Shot Visual Question Answering

Language: Python | License: MIT | Stargazers: 11 | Issues: 2 | Issues: 0

AnyText

Official implementation code of the paper <AnyText: Multilingual Visual Text Generation And Editing>

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Awesome-LLM-Reasoning

Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

CCD

[ICCV2023] Self-supervised Character-to-Character Distillation for Text Recognition

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

CFR_VQA

Coarse-to-Fine Reasoning for Visual Question Answering (CVPRW'22)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

Image-Generation-CoT

Investigating CoT Reasoning in Autoregressive Image Generation

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

Rethinking-Text-Segmentation

[CVPR 2021] Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

SeeAct

[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

VLSAT

[CVPR 2023] VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud

Language: Python | Stargazers: 0 | Issues: 0 | Issues: 0

verl

verl: Volcano Engine Reinforcement Learning for LLMs

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0