2024-07-31 |
The Llama 3 Herd of Models |
Abhimanyu Dubey et.al. |
2407.21783v1 |
null |
2024-07-31 |
RainMamba: Enhanced Locality Learning with State Space Models for Video Deraining |
Hongtao Wu et.al. |
2407.21773v1 |
null |
2024-07-31 |
ReplanVLM: Replanning Robotic Tasks with Visual Language Models |
Aoran Mei et.al. |
2407.21762v1 |
null |
2024-07-31 |
Learning Video Context as Interleaved Multimodal Sequences |
Kevin Qinghong Lin et.al. |
2407.21757v1 |
null |
2024-07-31 |
Topological Woodward-Hoffmann classification for cycloadditions in polycyclic aromatic azomethine ylides |
Juan Li et.al. |
2407.21756v1 |
null |
2024-07-31 |
A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation |
Mothilal Asokan et.al. |
2407.21739v1 |
null |
2024-07-31 |
Leveraging Self-Supervised Learning for Fetal Cardiac Planes Classification using Ultrasound Scan Videos |
Joseph Geo Benjamin et.al. |
2407.21738v1 |
null |
2024-07-31 |
Artificial Intelligence Approaches for Energy Efficiency: A Review |
Alberto Pasqualetto et.al. |
2407.21726v1 |
null |
2024-07-31 |
Open-Vocabulary Audio-Visual Semantic Segmentation |
Ruohao Guo et.al. |
2407.21721v1 |
null |
2024-07-31 |
Tora: Trajectory-oriented Diffusion Transformer for Video Generation |
Zhenghao Zhang et.al. |
2407.21705v1 |
null |
2024-07-30 |
Contrasting Deep Learning Models for Direct Respiratory Insufficiency Detection Versus Blood Oxygen Saturation Estimation |
Marcelo Matheus Gauy et.al. |
2407.20989v1 |
null |
2024-07-30 |
Transfer Learning for Multi-material Classification of Transition Metal Dichalcogenides with Atomic Force Microscopy |
Isaiah A. Moses et.al. |
2407.20975v1 |
null |
2024-07-30 |
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions |
Xiaowei Chi et.al. |
2407.20962v1 |
link |
2024-07-30 |
EAR: Edge-Aware Reconstruction of 3-D vertebrae structures from bi-planar X-ray images |
Lixing Tan et.al. |
2407.20937v1 |
null |
2024-07-30 |
Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering |
Yanpeng Zhao et.al. |
2407.20908v1 |
link |
2024-07-30 |
Simultaneous Multi-Slice Diffusion Imaging using Navigator-free Multishot Spiral Acquisition |
Yuancheng Jiang et.al. |
2407.20904v1 |
null |
2024-07-30 |
Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach |
Adam Wojciechowski et.al. |
2407.20899v1 |
null |
2024-07-30 |
MambaCapsule: Towards Transparent Cardiac Disease Diagnosis with Electrocardiography Using Mamba Capsule Network |
Yinlong Xu et.al. |
2407.20893v1 |
null |
2024-07-30 |
Shift operators and their classification |
Maria Carvalho et.al. |
2407.20890v1 |
null |
2024-07-30 |
Effective Black Box Testing of Sentiment Analysis Classification Networks |
Parsa Karbasizadeh et.al. |
2407.20884v1 |
null |
2024-07-29 |
SANGRIA: Surgical Video Scene Graph Optimization for Surgical Workflow Prediction |
Çağhan Köksal et.al. |
2407.20214v1 |
null |
2024-07-30 |
SpaER: Learning Spatio-temporal Equivariant Representations for Fetal Brain Motion Tracking |
Jian Wang et.al. |
2407.20198v2 |
null |
2024-07-29 |
Radiance Fields for Robotic Teleoperation |
Maximum Wilder-Smith et.al. |
2407.20194v1 |
null |
2024-07-29 |
Theia: Distilling Diverse Vision Foundation Models for Robot Learning |
Jinghuan Shang et.al. |
2407.20179v1 |
link |
2024-07-29 |
LatentArtiFusion: An Effective and Efficient Histological Artifacts Restoration Framework |
Zhenqi He et.al. |
2407.20172v1 |
link |
2024-07-29 |
Diffusion Feedback Helps CLIP See Better |
Wenxuan Wang et.al. |
2407.20171v1 |
null |
2024-07-29 |
Language-Conditioned Offline RL for Multi-Robot Navigation |
Steven Morad et.al. |
2407.20164v1 |
null |
2024-07-29 |
Quantum Machine Learning Architecture Search via Deep Reinforcement Learning |
Xin Dai et.al. |
2407.20147v1 |
null |
2024-07-30 |
AxiomVision: Accuracy-Guaranteed Adaptive Visual Model Selection for Perspective-Aware Video Analytics |
Xiangxiang Dai et.al. |
2407.20124v2 |
link |
2024-07-29 |
Integrable and superintegrable quantum mechanical systems with position dependent masses invariant with respect to one parametric Lie groups. 2. Systems with dilatation and shift symmetries |
A. G. Nikitin et.al. |
2407.20112v1 |
null |
2024-07-26 |
HRP: Human Affordances for Robotic Pre-Training |
Mohan Kumar Srirama et.al. |
2407.18911v1 |
null |
2024-07-26 |
Wolf: Captioning Everything with a World Summarization Framework |
Boyi Li et.al. |
2407.18908v1 |
null |
2024-07-26 |
A Scalable Quantum Non-local Neural Network for Image Classification |
Sparsh Gupta et.al. |
2407.18906v1 |
link |
2024-07-26 |
Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment |
Yuze Zheng et.al. |
2407.18854v1 |
null |
2024-07-26 |
The Role of Temporal Hierarchy in Spiking Neural Networks |
Filippo Moro et.al. |
2407.18838v1 |
null |
2024-07-26 |
Learning the Chaotic and Regular Nature of Trajectories in Hamiltonian Systems with Lagrangian descriptors |
Javier Jiménez López et.al. |
2407.18831v1 |
null |
2024-07-26 |
Binary orbit and disks properties of the RW Aur system using ALMA observations |
N. T. Kurtovic et.al. |
2407.18828v1 |
null |
2024-07-26 |
Three-dimensional ultrasound-based online system for automated ovarian follicle measurement |
Pedro Royo et.al. |
2407.18818v1 |
null |
2024-07-26 |
Automatic Detection of Moral Values in Music Lyrics |
Vjosa Preniqi et.al. |
2407.18787v1 |
null |
2024-07-26 |
Deep learning interpretable analysis for carbon star identification in Gaia DR3 |
Shuo Ye et.al. |
2407.18754v1 |
null |
2024-07-25 |
Review of Degenerate Higher Order Scalar Tensor Theories in Cosmology |
Andrei Lazanu et.al. |
2407.18234v1 |
null |
2024-07-25 |
One-point Statistics in various cosmic environments in the presence of massive neutrinos |
Mohadese Khoshtinat et.al. |
2407.18233v1 |
null |
2024-07-26 |
Enhanced Depth Estimation and 3D Geometry Reconstruction using Bayesian Helmholtz Stereopsis with Belief Propagation |
Razieh Azizi et.al. |
2407.18195v2 |
null |
2024-07-25 |
PianoMime: Learning a Generalist, Dexterous Piano Player from Internet Demonstrations |
Cheng Qian et.al. |
2407.18178v1 |
null |
2024-07-26 |
On-chip near-infrared spectroscopic sensing with over 520nm bandwidth |
Chunhui Yao et.al. |
2407.18172v2 |
null |
2024-07-25 |
IRIS: Wireless Ring for Vision-based Smart Home Interaction |
Maruchi Kim et.al. |
2407.18141v1 |
null |
2024-07-25 |
XS-VID: An Extremely Small Video Object Detection Dataset |
Jiahao Guo et.al. |
2407.18137v1 |
null |
2024-07-25 |
Estimating Earthquake Magnitude in Sentinel-1 Imagery via Ranking |
Daniele Rege Cambrin et.al. |
2407.18128v1 |
null |
2024-07-25 |
Self-supervised pre-training with diffusion model for few-shot landmark detection in x-ray images |
Roberto Di Via et.al. |
2407.18125v1 |
null |
2024-07-25 |
Multi-Resolution Histopathology Patch Graphs for Ovarian Cancer Subtyping |
Jack Breen et.al. |
2407.18105v1 |
link |
2024-07-24 |
SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency |
Yiming Xie et.al. |
2407.17470v1 |
null |
2024-07-24 |
SoNIC: Safe Social Navigation with Adaptive Conformal Inference and Constrained Reinforcement Learning |
Jianpeng Yao et.al. |
2407.17460v1 |
null |
2024-07-24 |
EuroCropsML: A Time Series Benchmark Dataset For Few-Shot Crop Type Classification |
Joana Reuss et.al. |
2407.17458v1 |
null |
2024-07-24 |
HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation |
Zhenzhi Wang et.al. |
2407.17438v1 |
link |
2024-07-24 |
Systematic study of High $E_J/E_C$ transmon qudits up to $d = 12$ |
Z. Wang et.al. |
2407.17407v1 |
null |
2024-07-24 |
Self-Calibrated Variance-Stabilizing Transformations for Real-World Image Denoising |
Sébastien Herbreteau et.al. |
2407.17399v1 |
null |
2024-07-24 |
Sampling-Based Hierarchical Trajectory Planning for Formation Flight |
Qingzhao Liu et.al. |
2407.17392v1 |
null |
2024-07-24 |
2D and 3D Deep Learning Models for MRI-based Parkinson's Disease Classification: A Comparative Analysis of Convolutional Kolmogorov-Arnold Networks, Convolutional Neural Networks, and Graph Convolutional Networks |
Salil B Patel et.al. |
2407.17380v1 |
null |
2024-07-24 |
Entropy Reweighted Conformal Classification |
Rui Luo et.al. |
2407.17377v1 |
null |
2024-07-24 |
MuST: Multi-Scale Transformers for Surgical Phase Recognition |
Alejandra Pérez et.al. |
2407.17361v1 |
link |
2024-07-23 |
Explanation Regularisation through the Lens of Attributions |
Pedro Ferreira et.al. |
2407.16693v1 |
null |
2024-07-23 |
On the local cohomology of secant varieties |
Sebastian Olano et.al. |
2407.16688v1 |
null |
2024-07-23 |
AutoRG-Brain: Grounded Report Generation for Brain MRI |
Jiayu Lei et.al. |
2407.16684v1 |
null |
2024-07-24 |
Goedel logics: Prenex fragments |
Matthias Baaz et.al. |
2407.16683v2 |
null |
2024-07-24 |
A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data |
Adrian Remonda et.al. |
2407.16680v2 |
link |
2024-07-23 |
From Imitation to Refinement -- Residual RL for Precise Visual Assembly |
Lars Ankile et.al. |
2407.16677v1 |
null |
2024-07-23 |
FakingRecipe: Detecting Fake News on Short Video Platforms from the Perspective of Creative Process |
Yuyan Bu et.al. |
2407.16670v1 |
null |
2024-07-23 |
EgoCVR: An Egocentric Benchmark for Fine-Grained Composed Video Retrieval |
Thomas Hummel et.al. |
2407.16658v1 |
link |
2024-07-23 |
Fluorescence Diffraction Tomography using Explicit Neural Fields |
Renzhi He et.al. |
2407.16657v1 |
null |
2024-07-23 |
MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence |
Canyu Zhao et.al. |
2407.16655v1 |
null |
2024-07-22 |
AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description |
Junyu Xie et.al. |
2407.15850v1 |
link |
2024-07-22 |
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models |
Mingze Xu et.al. |
2407.15841v1 |
null |
2024-07-23 |
QueST: Self-Supervised Skill Abstractions for Learning Continuous Control |
Atharva Mete et.al. |
2407.15840v2 |
null |
2024-07-22 |
Enhancing Cell Instance Segmentation in Scanning Electron Microscopy Images via a Deep Contour Closing Operator |
Florian Robert et.al. |
2407.15817v1 |
null |
2024-07-22 |
Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning |
Zhecheng Yuan et.al. |
2407.15815v1 |
null |
2024-07-22 |
The Evaporating Massive Embedded Stellar Cluster IRS 13 Close to Sgr A*. II. Kinematic structure |
Florian Peißker et.al. |
2407.15800v1 |
null |
2024-07-22 |
Adaptive Extensions of Unbiased Risk Estimators for Unsupervised Magnetic Resonance Image Denoising |
Reeshad Khan et.al. |
2407.15799v1 |
null |
2024-07-23 |
Disentangling spatio-temporal knowledge for weakly supervised object detection and segmentation in surgical video |
Guiqiu Liao et.al. |
2407.15794v2 |
null |
2024-07-22 |
LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding |
Haoning Wu et.al. |
2407.15754v1 |
link |
2024-07-22 |
SAM2CLIP2SAM: Vision Language Model for Segmentation of 3D CT Scans for Covid-19 Detection |
Dimitrios Kollias et.al. |
2407.15728v1 |
null |
2024-07-19 |
DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks |
Sarah Jabbour et.al. |
2407.14509v1 |
null |
2024-07-19 |
T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation |
Kaiyue Sun et.al. |
2407.14505v1 |
null |
2024-07-19 |
Nonlinear Schrödinger Network |
Yiming Zhou et.al. |
2407.14504v1 |
null |
2024-07-19 |
Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery |
Sukrut Rao et.al. |
2407.14499v1 |
link |
2024-07-19 |
Enhancing Layout Hotspot Detection Efficiency with YOLOv8 and PCA-Guided Augmentation |
Dongyang Wu et.al. |
2407.14498v1 |
null |
2024-07-19 |
Evaluating the Reliability of Self-Explanations in Large Language Models |
Korbinian Randl et.al. |
2407.14487v1 |
link |
2024-07-19 |
Co-synthesis of Histopathology Nuclei Image-Label Pairs using a Context-Conditioned Joint Diffusion Model |
Seonghui Min et.al. |
2407.14434v1 |
null |
2024-07-19 |
Dataset Distillation in Medical Imaging: A Feasibility Study |
Muyang Li et.al. |
2407.14429v1 |
null |
2024-07-19 |
Controllable and Efficient Multi-Class Pathology Nuclei Data Augmentation using Text-Conditioned Diffusion Models |
Hyun-Jic Oh et.al. |
2407.14426v1 |
null |
2024-07-19 |
Improving classification of road surface conditions via road area extraction and contrastive learning |
Linh Trinh et.al. |
2407.14418v1 |
null |
2024-07-18 |
GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model |
Abdelrahman Shaker et.al. |
2407.13772v1 |
null |
2024-07-18 |
Addressing Imbalance for Class Incremental Learning in Medical Image Classification |
Xuze Hao et.al. |
2407.13768v1 |
null |
2024-07-18 |
Shape of Motion: 4D Reconstruction from a Single Video |
Qianqian Wang et.al. |
2407.13764v1 |
null |
2024-07-18 |
Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion |
Boyang Deng et.al. |
2407.13759v1 |
null |
2024-07-18 |
Exploring Facial Biomarkers for Depression through Temporal Analysis of Action Units |
Aditya Parikh et.al. |
2407.13753v1 |
null |
2024-07-18 |
Temporal Representation Learning for Stock Similarities and Its Applications in Investment Management |
Yoontae Hwang et.al. |
2407.13751v1 |
null |
2024-07-18 |
Pose-guided multi-task video transformer for driver action recognition |
Ricardo Pizarro et.al. |
2407.13750v1 |
null |
2024-07-18 |
Multi-Label Learning with Stronger Consistency Guarantees |
Anqi Mao et.al. |
2407.13746v1 |
null |
2024-07-18 |
Realizable $H$-Consistent and Bayes-Consistent Loss Functions for Learning to Defer |
Anqi Mao et.al. |
2407.13732v1 |
null |
2024-07-18 |
Enhanced $H$-Consistency Bounds |
Anqi Mao et.al. |
2407.13722v1 |
null |
2024-07-17 |
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control |
Sherwin Bahmani et.al. |
2407.12781v1 |
null |
2024-07-17 |
Hallucination Index: An Image Quality Metric for Generative Reconstruction Models |
Matthew Tivnan et.al. |
2407.12780v1 |
null |
2024-07-17 |
LookupViT: Compressing visual information to a limited number of tokens |
Rajat Koner et.al. |
2407.12753v1 |
null |
2024-07-17 |
4Dynamic: Text-to-4D Generation with Hybrid Priors |
Yu-Jie Yuan et.al. |
2407.12684v1 |
null |
2024-07-17 |
Goldfish: Vision-Language Understanding of Arbitrarily Long Videos |
Kirolos Ataallah et.al. |
2407.12679v1 |
null |
2024-07-17 |
Promptable Counterfactual Diffusion Model for Unified Brain Tumor Segmentation and Generation with MRIs |
Yiqing Shen et.al. |
2407.12678v1 |
null |
2024-07-17 |
CoSIGN: Few-Step Guidance of ConSIstency Model to Solve General INverse Problems |
Jiankun Zhao et.al. |
2407.12676v1 |
link |
2024-07-17 |
Distilling Tiny and Ultra-fast Deep Neural Networks for Autonomous Navigation on Nano-UAVs |
Lorenzo Lamberti et.al. |
2407.12675v1 |
null |
2024-07-17 |
Enhancing the Utility of Privacy-Preserving Cancer Classification using Synthetic Data |
Richard Osuala et.al. |
2407.12669v1 |
null |
2024-07-17 |
Is That Rain? Understanding Effects on Visual Odometry Performance for Autonomous UAVs and Efficient DNN-based Rain Classification at the Edge |
Andrea Albanese et.al. |
2407.12663v1 |
null |
2024-07-16 |
Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling |
Jaehyeok Kim et.al. |
2407.11962v1 |
null |
2024-07-16 |
A Transformer-based Approach for Augmenting Software Engineering Chatbots Datasets |
Ahmad Abdellatif et.al. |
2407.11955v1 |
null |
2024-07-16 |
Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation |
Olga Zatsarynna et.al. |
2407.11954v1 |
null |
2024-07-16 |
Temporally Consistent Stereo Matching |
Jiaxi Zeng et.al. |
2407.11950v1 |
link |
2024-07-17 |
Hierarchical Separable Video Transformer for Snapshot Compressive Imaging |
Ping Wang et.al. |
2407.11946v2 |
link |
2024-07-16 |
Tackling Oversmoothing in GNN via Graph Sparsification: A Truss-based Approach |
Tanvir Hossain et.al. |
2407.11928v1 |
null |
2024-07-16 |
The Strength of Bisymmetric Modes in SDSS-IV/MaNGA Barred Galaxy Kinematics |
Brian DiGiorgio Zanger et.al. |
2407.11908v1 |
null |
2024-07-16 |
GraphFM: A Scalable Framework for Multi-Graph Pretraining |
Divyansha Lachi et.al. |
2407.11907v1 |
null |
2024-07-16 |
SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge |
Hao Ding et.al. |
2407.11906v1 |
null |
2024-07-16 |
Automated production of batched unclonable micro-patterns anti-counterfeiting labels with strong robustness and rapid recognition speed |
Yuzheng He et.al. |
2407.11886v1 |
null |
2024-07-15 |
No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations |
Walter Simoncini et.al. |
2407.10964v1 |
link |
2024-07-15 |
InVi: Object Insertion In Videos Using Off-the-Shelf Diffusion Models |
Nirat Saini et.al. |
2407.10958v1 |
null |
2024-07-15 |
MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models |
Chengguang Gan et.al. |
2407.10953v1 |
null |
2024-07-15 |
IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation |
Yuanhao Zhai et.al. |
2407.10937v1 |
link |
2024-07-15 |
Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together |
Dilara Soylu et.al. |
2407.10930v1 |
null |
2024-07-15 |
In-Loop Filtering via Trained Look-Up Tables |
Zhuoyuan Li et.al. |
2407.10926v1 |
null |
2024-07-15 |
A Dual-Attention Aware Deep Convolutional Neural Network for Early Alzheimer's Detection |
Pandiyaraju V et.al. |
2407.10921v1 |
null |
2024-07-16 |
DataDream: Few-shot Guided Dataset Generation |
Jae Myung Kim et.al. |
2407.10910v2 |
link |
2024-07-15 |
Interpreting Hand gestures using Object Detection and Digits Classification |
Sangeetha K et.al. |
2407.10902v1 |
null |
2024-07-15 |
Leveraging Multimodal CycleGAN for the Generation of Anatomically Accurate Synthetic CT Scans from MRIs |
Leonardo Crespi et.al. |
2407.10888v1 |
null |
2024-07-12 |
Non-Hermitian Origin of Wannier Localizability and Detachable Topological Boundary States |
Daichi Nakamura et.al. |
2407.09458v1 |
null |
2024-07-12 |
Let Me DeCode You: Decoder Conditioning with Tabular Data |
Tomasz Szczepański et.al. |
2407.09437v1 |
link |
2024-07-12 |
Rethinking temporal self-similarity for repetitive action counting |
Yanan Luo et.al. |
2407.09431v1 |
null |
2024-07-12 |
TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models |
Hang Zou et.al. |
2407.09424v1 |
null |
2024-07-12 |
A grid of self-consistent MSG (MARCS-StaticWeather-GGchem) cool stellar, sub-stellar, and exoplanetary model atmospheres |
Uffe G. Jørgensen et.al. |
2407.09397v1 |
null |
2024-07-12 |
Open-Canopy: A Country-Scale Benchmark for Canopy Height Estimation at Very High Resolution |
Fajwel Fogel et.al. |
2407.09392v1 |
link |
2024-07-12 |
Radiance Fields from Photons |
Sacha Jungerman et.al. |
2407.09386v1 |
null |
2024-07-12 |
Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation |
Zhilin Zhu et.al. |
2407.09367v1 |
link |
2024-07-12 |
Novel clustered federated learning based on local loss |
Endong Gu et.al. |
2407.09360v1 |
link |
2024-07-12 |
Imaging Interiors: An Implicit Solution to Electromagnetic Inverse Scattering Problems |
Ziyuan Luo et.al. |
2407.09352v1 |
null |
2024-07-11 |
Video Diffusion Alignment via Reward Gradients |
Mihir Prabhudesai et.al. |
2407.08737v1 |
link |
2024-07-11 |
Real-Time Anomaly Detection and Reactive Planning with Large Language Models |
Rohan Sinha et.al. |
2407.08735v1 |
null |
2024-07-11 |
WhisperNetV2: SlowFast Siamese Network For Lip-Based Biometrics |
Abdollah Zakeri et.al. |
2407.08717v1 |
null |
2024-07-11 |
Sensor-Aware Classifiers for Energy-Efficient Time Series Applications on IoT Devices |
Dina Hussein et.al. |
2407.08715v1 |
null |
2024-07-11 |
Towards Efficient Deployment of Hybrid SNNs on Neuromorphic and Edge AI Hardware |
James Seekings et.al. |
2407.08704v1 |
null |
2024-07-11 |
Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models |
Zhening Xing et.al. |
2407.08701v1 |
null |
2024-07-11 |
ElasticAST: An Audio Spectrogram Transformer for All Length and Resolutions |
Jiu Feng et.al. |
2407.08691v1 |
link |
2024-07-11 |
Generalizable Implicit Motion Modeling for Video Frame Interpolation |
Zujin Guo et.al. |
2407.08680v1 |
null |
2024-07-11 |
Still-Moving: Customized Video Generation without Customized Video Data |
Hila Chefer et.al. |
2407.08674v1 |
null |
2024-07-11 |
NODE-Adapter: Neural Ordinary Differential Equations for Better Vision-Language Reasoning |
Yi Zhang et.al. |
2407.08672v1 |
null |
2024-07-10 |
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models |
Feng Li et.al. |
2407.07895v1 |
link |
2024-07-10 |
Vegetable Peeling: A Case Study in Constrained Dexterous Manipulation |
Tao Chen et.al. |
2407.07884v1 |
null |
2024-07-10 |
Controlling Space and Time with Diffusion Models |
Daniel Watson et.al. |
2407.07860v1 |
null |
2024-07-11 |
Functional Assessment of Cerebral Capillaries using Single Capillary Reporters in Ultrasound Localization Microscopy |
Stephen A Lee et.al. |
2407.07857v2 |
null |
2024-07-10 |
Study on Aspect Ratio Variability toward Robustness of Vision Transformer-based Vehicle Re-identification |
Mei Qiu et.al. |
2407.07842v1 |
null |
2024-07-10 |
Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective |
Shengjia Chen et.al. |
2407.07841v1 |
link |
2024-07-10 |
Probe and Prejudice: Classification of compact objects and model comparison using EOS knowledge |
Hauke Koehn et.al. |
2407.07837v1 |
null |
2024-07-10 |
RT-LA-VocE: Real-Time Low-SNR Audio-Visual Speech Enhancement |
Honglie Chen et.al. |
2407.07825v1 |
null |
2024-07-10 |
New Gravitational Wave Discoveries Enabled by Machine Learning |
Alexandra E. Koloniari et.al. |
2407.07820v1 |
null |
2024-07-10 |
The Misclassification Likelihood Matrix: Some Classes Are More Likely To Be Misclassified Than Others |
Daniel Sikar et.al. |
2407.07818v1 |
null |
2024-07-09 |
V-VIPE: Variational View Invariant Pose Embedding |
Mara Levy et.al. |
2407.07092v1 |
null |
2024-07-09 |
Fine-Tuning Linear Layers Only Is a Simple yet Effective Way for Task Arithmetic |
Ruochen Jin et.al. |
2407.07089v1 |
link |
2024-07-09 |
MoSt-DSA: Modeling Motion and Structural Interactions for Direct Multi-Frame Interpolation in DSA Images |
Ziyang Xu et.al. |
2407.07078v1 |
link |
2024-07-09 |
MADE-for-ASD: A Multi-Atlas Deep Ensemble Network for Diagnosing Autism Spectrum Disorder |
Md Rakibul Hasan et.al. |
2407.07076v1 |
null |
2024-07-10 |
CAPformer: Compression-Aware Pre-trained Transformer for Low-Light Image Enhancement |
Wei Wang et.al. |
2407.07056v2 |
null |
2024-07-09 |
Latent Space Imaging |
Matheus Souza et.al. |
2407.07052v1 |
null |
2024-07-09 |
Simple and Interpretable Probabilistic Classifiers for Knowledge Graphs |
Christian Riefolo et.al. |
2407.07045v1 |
null |
2024-07-09 |
Free Fermionic Constructions of Heterotic Strings |
Ioannis Florakis et.al. |
2407.07034v1 |
null |
2024-07-09 |
Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition |
Daiqing Wu et.al. |
2407.07026v1 |
null |
2024-07-09 |
Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization |
Jeongseok Hyun et.al. |
2407.07024v1 |
link |
2024-07-08 |
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision |
Orr Zohar et.al. |
2407.06189v1 |
link |
2024-07-08 |
Classification of Cellular Automata based on the Hamming distance |
Gaspar Alfaro et.al. |
2407.06175v1 |
null |
2024-07-08 |
The Tug-of-War Between Deepfake Generation and Detection |
Hannah Lee et.al. |
2407.06174v1 |
null |
2024-07-08 |
PanDORA: Casual HDR Radiance Acquisition for Indoor Scenes |
Mohammad Reza Karimi Dastjerdi et.al. |
2407.06150v1 |
null |
2024-07-08 |
Physics-informed machine learning approaches to reactor antineutrino detection |
Sophia Farrell et.al. |
2407.06139v1 |
null |
2024-07-08 |
Depression Detection and Analysis using Large Language Models on Textual and Audio-Visual Modalities |
Avinash Anand et.al. |
2407.06125v1 |
null |
2024-07-08 |
Accelerating Diffusion for SAR-to-Optical Image Translation via Adversarial Consistency Distillation |
Xinyu Bai et.al. |
2407.06095v1 |
null |
2024-07-08 |
ERR@HRI 2024 Challenge: Multimodal Detection of Errors and Failures in Human-Robot Interactions |
Micol Spitale et.al. |
2407.06094v1 |
null |
2024-07-08 |
Artificial Intuition: Efficient Classification of Scientific Abstracts |
Harsh Sakhrani et.al. |
2407.06093v1 |
null |
2024-07-08 |
Assessing Cardiomegaly in Dogs Using a Simple CNN Model |
Nikhil Deekonda et.al. |
2407.06092v1 |
null |
2024-07-05 |
VCoME: Verbal Video Composition with Multimodal Editing Effects |
Weibo Gong et.al. |
2407.04697v1 |
null |
2024-07-05 |
Enhancing Vehicle Re-identification and Matching for Weaving Analysis |
Mei Qiu et.al. |
2407.04688v1 |
null |
2024-07-05 |
Embracing Massive Medical Data |
Yu-Cheng Chou et.al. |
2407.04687v1 |
link |
2024-07-05 |
Is plantar thermography a valid digital biomarker for characterising diabetic foot ulceration risk? |
Akshay Jagadeesh et.al. |
2407.04676v1 |
null |
2024-07-05 |
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation |
Yuhan Zhu et.al. |
2407.04603v1 |
null |
2024-07-05 |
Multimodal Classification via Modal-Aware Interactive Enhancement |
Qing-Yuan Jiang et.al. |
2407.04587v1 |
null |
2024-07-05 |
A Degree Bound for Planar Functions |
Christof Beierle et.al. |
2407.04570v1 |
null |
2024-07-05 |
Pencils of plane cubics with one base point |
Riccardo Moschetti et.al. |
2407.04569v1 |
null |
2024-07-05 |
Anticipating Solar Flares |
Hugh S. Hudson et.al. |
2407.04567v1 |
null |
2024-07-05 |
Real Time Emotion Analysis Using Deep Learning for Education, Entertainment, and Beyond |
Abhilash Khuntia et.al. |
2407.04560v1 |
null |
2024-07-03 |
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output |
Pan Zhang et.al. |
2407.03320v1 |
link |
2024-07-03 |
Value-Penalized Auxiliary Control from Examples for Learning without Rewards or Demonstrations |
Trevor Ablett et.al. |
2407.03311v1 |
link |
2024-07-03 |
Accelerated Proton Resonance Frequency-based Magnetic Resonance Thermometry by Optimized Deep Learning Method |
Sijie Xu et.al. |
2407.03308v1 |
link |
2024-07-03 |
HoloHisto: End-to-end Gigapixel WSI Segmentation with 4K Resolution Sequential Tokenization |
Yucheng Tang et.al. |
2407.03307v1 |
null |
2024-07-03 |
VCHAR:Variance-Driven Complex Human Activity Recognition framework with Generative Representation |
Yuan Sun et.al. |
2407.03291v1 |
null |
2024-07-03 |
Using Photoplethysmography to Detect Real-time Blood Pressure Changes with a Calibration-free Deep Learning Model |
Jingyuan Hong et.al. |
2407.03274v1 |
null |
2024-07-03 |
Modern Neighborhood Components Analysis: A Deep Tabular Baseline Two Decades Later |
Han-Jia Ye et.al. |
2407.03257v1 |
link |
2024-07-03 |
STF: Sentence Transformer Fine-Tuning For Topic Categorization With Limited Data |
Kheir Eddine Daouadi et.al. |
2407.03253v1 |
null |
2024-07-03 |
ACTRESS: Active Retraining for Semi-supervised Visual Grounding |
Weitai Kang et.al. |
2407.03251v1 |
null |
2024-07-04 |
TieBot: Learning to Knot a Tie from Visual Demonstration through a Real-to-Sim-to-Real Approach |
Weikun Peng et.al. |
2407.03245v2 |
null |
2024-07-02 |
Characterizing the Interpretability of Attention Maps in Digital Pathology |
Tomé Albuquerque et.al. |
2407.02484v1 |
null |
2024-07-02 |
Ensemble of pre-trained language models and data augmentation for hate speech detection from Arabic tweets |
Kheir Eddine Daouadi et.al. |
2407.02448v1 |
null |
2024-07-02 |
PLeaS -- Merging Models with Permutations and Least Squares |
Anshul Nasery et.al. |
2407.02447v1 |
null |
2024-07-02 |
Evaluating the Robustness of Adverse Drug Event Classification Models Using Templates |
Dorothea MacPhail et.al. |
2407.02432v1 |
null |
2024-07-02 |
AXIAL: Attention-based eXplainability for Interpretable Alzheimer's Localized Diagnosis using 2D CNNs on 3D MRI brain scans |
Gabriele Lozupone et.al. |
2407.02418v1 |
link |
2024-07-03 |
Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs |
Jinmin Li et.al. |
2407.02411v2 |
null |
2024-07-02 |
Tiny-PULP-Dronets: Squeezing Neural Networks for Faster and Lighter Inference on Multi-Tasking Autonomous Nano-Drones |
Lorenzo Lamberti et.al. |
2407.02405v1 |
null |
2024-07-03 |
A neural networks method to search for long transient gravitational waves |
Francesca Attadio et.al. |
2407.02391v2 |
null |
2024-07-02 |
Real HSI-MSI-PAN image dataset for the hyperspectral/multi-spectral/panchromatic image fusion and super-resolution fields |
Shuangliang Li et.al. |
2407.02387v1 |
link |
2024-07-02 |
OpenSlot: Mixed Open-set Recognition with Object-centric Learning |
Xu Yin et.al. |
2407.02386v1 |
null |
2024-06-28 |
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs |
Sukmin Yun et.al. |
2406.20098v1 |
link |
2024-06-28 |
LLaVolta: Efficient Multi-modal Models via Stage-wise Visual Context Compression |
Jieneng Chen et.al. |
2406.20092v1 |
link |
2024-06-28 |
Minimax And Adaptive Transfer Learning for Nonparametric Classification under Distributed Differential Privacy Constraints |
Arnab Auddy et.al. |
2406.20088v1 |
null |
2024-06-28 |
Extreme horizon equation |
Wojciech Kamiński et.al. |
2406.20068v1 |
null |
2024-06-28 |
Modeling and LQR Control of Insect Sized Flapping Wing Robot |
Daksh Dhingra et.al. |
2406.20061v1 |
null |
2024-06-28 |
Pairwise Difference Learning for Classification |
Mohamed Karim Belaid et.al. |
2406.20031v1 |
link |
2024-06-28 |
On the Trade-off between Flatness and Optimization in Distributed Learning |
Ying Cao et.al. |
2406.20006v1 |
null |
2024-06-28 |
Malaria Cell Detection Using Deep Neural Networks |
Saurabh Sawant et.al. |
2406.20005v1 |
null |
2024-06-28 |
Impact of Initialization on Intra-subject Pediatric Brain MR Image Registration: A Comparative Analysis between SyN ANTs and Deep Learning-Based Approaches |
Andjela Dimitrijevic et.al. |
2406.19943v1 |
link |
2024-07-01 |
GRACE: Graph-Regularized Attentive Convolutional Entanglement with Laplacian Smoothing for Robust DeepFake Video Detection |
Chih-Chung Hsu et.al. |
2406.19941v2 |
link |
2024-06-27 |
ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos |
Jr-Jen Chen et.al. |
2406.19392v1 |
link |
2024-06-27 |
Fibottention: Inceptive Visual Representation Learning with Diverse Attention Across Heads |
Ali Khaleghi Rahimian et.al. |
2406.19391v1 |
link |
2024-06-27 |
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding |
Tao Zhang et.al. |
2406.19389v1 |
null |
2024-06-27 |
Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model |
Haobo Yuan et.al. |
2406.19369v1 |
null |
2024-06-27 |
IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language |
Lucky Susanto et.al. |
2406.19349v1 |
null |
2024-06-27 |
Learning Visual Conditioning Tokens to Correct Domain Shift for Fully Test-time Adaptation |
Yushun Tang et.al. |
2406.19341v1 |
null |
2024-06-28 |
LiverUSRecon: Automatic 3D Reconstruction and Volumetry of the Liver with a Few Partial Ultrasound Scans |
Kaushalya Sivayogaraj et.al. |
2406.19336v2 |
null |
2024-06-27 |
PNeRV: A Polynomial Neural Representation for Videos |
Sonam Gupta et.al. |
2406.19299v1 |
null |
2024-06-27 |
Leveraging Contrastive Learning for Enhanced Node Representations in Tokenized Graph Transformers |
Jinsong Chen et.al. |
2406.19258v1 |
null |
2024-06-27 |
Enhancing Video-Language Representations with Structural Spatio-Temporal Alignment |
Hao Fei et.al. |
2406.19255v1 |
null |
2024-06-26 |
Towards Compositionality in Concept Learning |
Adam Stein et.al. |
2406.18534v1 |
link |
2024-06-26 |
MatchTime: Towards Automatic Soccer Game Commentary Generation |
Jiayuan Rao et.al. |
2406.18530v1 |
null |
2024-06-26 |
MultiDiff: Consistent Novel View Synthesis from a Single Image |
Norman Müller et.al. |
2406.18524v1 |
null |
2024-06-26 |
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation |
Shenghai Yuan et.al. |
2406.18522v1 |
null |
2024-06-27 |
Distinguishing mechanisms of social contagion from local network view |
Elsa Andres et.al. |
2406.18519v2 |
null |
2024-06-26 |
Assessment of Clonal Hematopoiesis of Indeterminate Potential from Cardiac Magnetic Resonance Imaging using Deep Learning in a Cardio-oncology Population |
Sangeon Ryu et.al. |
2406.18508v1 |
null |
2024-06-26 |
Robust Surgical Phase Recognition From Annotation Efficient Supervision |
Or Rubin et.al. |
2406.18481v1 |
null |
2024-06-26 |
Universal Anomaly Detection at the LHC: Transforming Optimal Classifiers and the DDD Method |
Sascha Caron et.al. |
2406.18469v1 |
null |
2024-06-26 |
An Autotuning-based Optimization Framework for Mixed-kernel SVM Classifications in Smart Pixel Datasets and Heterojunction Transistors |
Xingfu Wu et.al. |
2406.18445v1 |
null |
2024-06-26 |
Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling |
Abril Corona-Figueroa et.al. |
2406.18422v1 |
null |
2024-06-25 |
Text-Animator: Controllable Visual Text Video Generation |
Lin Liu et.al. |
2406.17777v1 |
null |
2024-06-25 |
MotionBooth: Motion-Aware Customized Text-to-Video Generation |
Jianzong Wu et.al. |
2406.17758v1 |
null |
2024-06-25 |
Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation |
Tushar Prasanna Swaminathan et.al. |
2406.17749v1 |
null |
2024-06-25 |
Structured Unrestricted-Rank Matrices for Parameter Efficient Fine-tuning |
Arijit Sehanobish et.al. |
2406.17740v1 |
null |
2024-06-25 |
Mask-Guided Attention U-Net for Enhanced Neonatal Brain Extraction and Image Preprocessing |
Bahram Jafrasteh et.al. |
2406.17709v1 |
link |
2024-06-25 |
SurgeMOD: Translating image-space tissue motions into vision-based surgical forces |
Mikel De Iturrate Reyzabal et.al. |
2406.17707v1 |
link |
2024-06-25 |
Dualities for universal (co)acting Hopf monoids |
Ana Agore et.al. |
2406.17684v1 |
null |
2024-06-25 |
Local-to-Global Cross-Modal Attention-Aware Fusion for HSI-X Semantic Segmentation |
Xuming Zhang et.al. |
2406.17679v1 |
null |
2024-06-25 |
Lifting of locally initial objects and universal (co)acting Hopf algebras |
Ana Agore et.al. |
2406.17677v1 |
null |
2024-06-25 |
Brain Tumor Classification using Vision Transformer with Selective Cross-Attention Mechanism and Feature Calibration |
Mohammad Ali Labbaf Khaniki et.al. |
2406.17670v1 |
null |
2024-06-24 |
StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal |
Chongjie Ye et.al. |
2406.16864v1 |
null |
2024-06-24 |
FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models |
Haonan Qiu et.al. |
2406.16863v1 |
link |
2024-06-24 |
Dreamitate: Real-World Visuomotor Policy Learning via Video Generation |
Junbang Liang et.al. |
2406.16862v1 |
null |
2024-06-24 |
Long Context Transfer from Language to Vision |
Peiyuan Zhang et.al. |
2406.16852v1 |
link |
2024-06-24 |
Unsupervised Domain Adaptation for Pediatric Brain Tumor Segmentation |
Jingru Fu et.al. |
2406.16848v1 |
null |
2024-06-24 |
Exploring Factual Entailment with NLI: A News Media Study |
Guy Mor-Lan et.al. |
2406.16842v1 |
null |
2024-06-24 |
A Certifiable Algorithm for Simultaneous Shape Estimation and Object Tracking |
Lorenzo Shaikewitz et.al. |
2406.16837v1 |
null |
2024-06-24 |
USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations |
Mounika Marreddy et.al. |
2406.16833v1 |
null |
2024-06-24 |
The classification of simple complex Lie superalgebras of polynomial vector fields and their deformations |
Dimitry Leites et.al. |
2406.16760v1 |
null |
2024-06-24 |
The MRI Scanner as a Diagnostic: Image-less Active Sampling |
Yuning Du et.al. |
2406.16754v1 |
null |
2024-06-21 |
Full-Scale Indexing and Semantic Annotation of CT Imaging: Boosting FAIRness |
Hannes Ulrich et.al. |
2406.15340v1 |
null |
2024-06-21 |
Image Conductor: Precision Control for Interactive Video Synthesis |
Yaowei Li et.al. |
2406.15339v1 |
null |
2024-06-21 |
An End-to-End, Segmentation-Free, Arabic Handwritten Recognition Model on KHATT |
Sondos Aabed et.al. |
2406.15329v1 |
null |
2024-06-21 |
Fine-grained Attention in Hierarchical Transformers for Tabular Time-series |
Raphael Azorin et.al. |
2406.15327v1 |
link |
2024-06-21 |
NLP-KG: A System for Exploratory Search of Scientific Literature in Natural Language Processing |
Tim Schopf et.al. |
2406.15294v1 |
link |
2024-06-21 |
Towards Fine-Grained Citation Evaluation in Generated Text: A Comparative Analysis of Faithfulness Metrics |
Weijia Zhang et.al. |
2406.15264v1 |
null |
2024-06-24 |
VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation |
Xuan He et.al. |
2406.15252v2 |
null |
2024-06-21 |
Retrieval Augmented Zero-Shot Text Classification |
Tassallah Abdullahi et.al. |
2406.15241v1 |
null |
2024-06-21 |
Model Equivalences |
Michael Benedikt et.al. |
2406.15235v1 |
null |
2024-06-21 |
Rate-Splitting Multiple Access for Overloaded Multi-group Multicast: A First Experimental Study |
Xinze Lyu et.al. |
2406.15217v1 |
null |
2024-06-20 |
A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models |
Xincheng Shuai et.al. |
2406.14555v1 |
link |
2024-06-21 |
Advancing Fine-Grained Classification by Structure and Subject Preserving Augmentation |
Eyal Michaeli et.al. |
2406.14551v2 |
link |
2024-06-20 |
IRASim: Learning Interactive Real-Robot Action Simulators |
Fangqi Zhu et.al. |
2406.14540v1 |
null |
2024-06-20 |
Epicardium Prompt-guided Real-time Cardiac Ultrasound Frame-to-volume Registration |
Long Lei et.al. |
2406.14534v1 |
link |
2024-06-20 |
Local symmetries in partially ordered sets |
Christoph Minz et.al. |
2406.14533v1 |
null |
2024-06-20 |
Fantastic Copyrighted Beasts and How (Not) to Generate Them |
Luxi He et.al. |
2406.14526v1 |
null |
2024-06-20 |
MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding |
Xinyu Fang et.al. |
2406.14515v1 |
link |
2024-06-20 |
V-LASIK: Consistent Glasses-Removal from Videos Using Synthetic Data |
Rotem Shalev-Arkushin et.al. |
2406.14510v1 |
null |
2024-06-20 |
LLaSA: Large Multimodal Agent for Human Activity Analysis Through Wearable Sensors |
Sheikh Asif Imran et.al. |
2406.14498v1 |
link |
2024-06-20 |
African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification |
Gregor Geigle et.al. |
2406.14496v1 |
null |
2024-06-18 |
DrVideo: Document Retrieval Based Long Video Understanding |
Ziyu Ma et.al. |
2406.12846v1 |
null |
2024-06-18 |
LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging |
Jinuk Kim et.al. |
2406.12837v1 |
link |
2024-06-18 |
GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation |
Ci-Siang Lin et.al. |
2406.12834v1 |
null |
2024-06-18 |
VIA: A Spatiotemporal Video Adaptation Framework for Global and Local Video Editing |
Jing Gu et.al. |
2406.12831v1 |
null |
2024-06-18 |
Neural Approximate Mirror Maps for Constrained Diffusion Models |
Berthy T. Feng et.al. |
2406.12816v1 |
null |
2024-06-18 |
Privacy Preserving Federated Learning in Medical Imaging with Uncertainty Estimation |
Nikolas Koutsoubis et.al. |
2406.12815v1 |
link |
2024-06-18 |
Probabilistic Temporal Prediction of Continuous Disease Trajectories and Treatment Effects Using Neural SDEs |
Joshua Durso-Finley et.al. |
2406.12807v1 |
null |
2024-06-18 |
Composited-Nested-Learning with Data Augmentation for Nested Named Entity Recognition |
Xingming Liao et.al. |
2406.12779v1 |
null |
2024-06-18 |
Medvedev degrees of subshifts on groups |
Sebastián Barbieri et.al. |
2406.12777v1 |
null |
2024-06-18 |
Latent Intuitive Physics: Learning to Transfer Hidden Physics from A 3D Video |
Xiangming Zhu et.al. |
2406.12769v1 |
null |
2024-06-17 |
Scaling the Codebook Size of VQGAN to 100,000 with a Utilization Rate of 99% |
Lei Zhu et.al. |
2406.11837v1 |
link |
2024-06-17 |
Spectral Introspection Identifies Group Training Dynamics in Deep Neural Networks for Neuroimaging |
Bradley T. Baker et.al. |
2406.11825v1 |
null |
2024-06-17 |
Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation |
Alexander Raistrick et.al. |
2406.11824v1 |
null |
2024-06-17 |
VideoLLM-online: Online Video Large Language Model for Streaming Video |
Joya Chen et.al. |
2406.11816v1 |
null |
2024-06-17 |
Faces of Experimental Pain: Transferability of Deep Learned Heat Pain Features to Electrical Pain |
Pooja Prajod et.al. |
2406.11808v1 |
null |
2024-06-17 |
Mix-Domain Contrastive Learning for Unpaired H&E-to-IHC Stain Translation |
Song Wang et.al. |
2406.11799v1 |
null |
2024-06-17 |
CELL your Model: Contrastive Explanation Methods for Large Language Models |
Ronny Luss et.al. |
2406.11785v1 |
null |
2024-06-17 |
Task Me Anything |
Jieyu Zhang et.al. |
2406.11775v1 |
link |
2024-06-17 |
Domain Generalization for In-Orbit 6D Pose Estimation |
Antoine Legrand et.al. |
2406.11743v1 |
null |
2024-06-17 |
Lightweight Model Pre-training via Language Guided Knowledge Distillation |
Mingsheng Li et.al. |
2406.11689v1 |
link |
2024-06-14 |
VideoGUI: A Benchmark for GUI Automation from Instructional Videos |
Kevin Qinghong Lin et.al. |
2406.10227v1 |
null |
2024-06-14 |
Short Film Dataset (SFD): A Benchmark for Story-Level Video Understanding |
Ridouane Ghermi et.al. |
2406.10221v1 |
null |
2024-06-14 |
SSTFB: Leveraging self-supervised pretext learning and temporal self-attention with feature branching for real-time video polyp segmentation |
Ziang Xu et.al. |
2406.10200v1 |
null |
2024-06-14 |
CarLLaVA: Vision language models for camera-only closed-loop driving |
Katrin Renz et.al. |
2406.10165v1 |
null |
2024-06-14 |
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition |
Guinan Li et.al. |
2406.10152v1 |
null |
2024-06-14 |
Training-free Camera Control for Video Generation |
Chen Hou et.al. |
2406.10126v1 |
null |
2024-06-14 |
Modified Risk Formulation for Improving the Prediction of Knee Osteoarthritis Progression |
Haresh Rengaraj Rajamohan et.al. |
2406.10119v1 |
null |
2024-06-14 |
ECGMamba: Towards Efficient ECG Classification with BiSSM |
Yupeng Qiang et.al. |
2406.10098v1 |
null |
2024-06-14 |
Biomarker based Cancer Classification using an Ensemble with Pre-trained Models |
Chongmin Lee et.al. |
2406.10087v1 |
null |
2024-06-14 |
On the Evaluation of Speech Foundation Models for Spoken Language Understanding |
Siddhant Arora et.al. |
2406.10083v1 |
null |
2024-06-13 |
VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding |
Muhammad Maaz et.al. |
2406.09418v1 |
link |
2024-06-13 |
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels |
Duy-Kien Nguyen et.al. |
2406.09415v1 |
null |
2024-06-13 |
CodedEvents: Optimal Point-Spread-Function Engineering for 3D-Tracking with Event Cameras |
Sachin Shah et.al. |
2406.09409v1 |
null |
2024-06-13 |
Instruct 4D-to-4D: Editing 4D Scenes as Pseudo-3D Scenes Using 2D Diffusion |
Linzhan Mou et.al. |
2406.09402v1 |
null |
2024-06-13 |
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation |
Junke Wang et.al. |
2406.09399v1 |
link |
2024-06-13 |
Too Many Frames, not all Useful: Efficient Strategies for Long-Form Video QA |
Jongwoo Park et.al. |
2406.09396v1 |
null |
2024-06-13 |
LLAVIDAL: Benchmarking Large Language Vision Models for Daily Activities of Living |
Rajatsubhra Chakraborty et.al. |
2406.09390v1 |
null |
2024-06-13 |
Sagiri: Low Dynamic Range Image Enhancement with Generative Diffusion Prior |
Baiang Li et.al. |
2406.09389v1 |
null |
2024-06-13 |
Exploring the Spectrum of Visio-Linguistic Compositionality and Recognition |
Youngtaek Oh et.al. |
2406.09388v1 |
link |
2024-06-13 |
SimGen: Simulator-conditioned Driving Scene Generation |
Yunsong Zhou et.al. |
2406.09386v1 |
null |
2024-06-12 |
On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models |
Hashmat Shadab Malik et.al. |
2406.08486v1 |
link |
2024-06-12 |
RMem: Restricted Memory Banks Improve Video Object Segmentation |
Junbao Zhou et.al. |
2406.08476v1 |
null |
2024-06-12 |
AToM-Bot: Embodied Fulfillment of Unspoken Human Needs with Affective Theory of Mind |
Wei Ding et.al. |
2406.08455v1 |
null |
2024-06-12 |
Transformation-Dependent Adversarial Attacks |
Yaoteng Tan et.al. |
2406.08443v1 |
null |
2024-06-12 |
A Sticker is Worth a Thousand Words: Characterizing the Use of Stickers in WhatsApp Political Groups in Brazil |
Philipe Melo et.al. |
2406.08429v1 |
null |
2024-06-12 |
Improving Noise Robustness through Abstractions and its Impact on Machine Learning |
Alfredo Ibias et.al. |
2406.08428v1 |
null |
2024-06-12 |
OmniCorpus: An Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text |
Qingyun Li et.al. |
2406.08418v1 |
link |
2024-06-13 |
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos |
Xuehai He et.al. |
2406.08407v2 |
link |
2024-06-12 |
Eyes Wide Unshut: Unsupervised Mistake Detection in Egocentric Video by Detecting Unpredictable Gaze |
Michele Mazzamuto et.al. |
2406.08379v1 |
null |
2024-06-12 |
2.5D Multi-view Averaging Diffusion Model for 3D Medical Image Translation: Application to Low-count PET Reconstruction with CT-less Attenuation Correction |
Tianqi Chen et.al. |
2406.08374v1 |
null |
2024-06-11 |
Blur-aware Spatio-temporal Sparse Transformer for Video Deblurring |
Huicong Zhang et.al. |
2406.07551v1 |
link |
2024-06-11 |
Image and Video Tokenization with Binary Spherical Quantization |
Yue Zhao et.al. |
2406.07548v1 |
link |
2024-06-11 |
Zero-shot Image Editing with Reference Imitation |
Xi Chen et.al. |
2406.07547v1 |
null |
2024-06-11 |
Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance |
Kuan Heng Lin et.al. |
2406.07540v1 |
null |
2024-06-11 |
BAKU: An Efficient Transformer for Multi-Task Policy Learning |
Siddhant Haldar et.al. |
2406.07539v1 |
null |
2024-06-11 |
Transforming a rare event search into a not-so-rare event search in real-time with deep learning-based object detection |
J. Schueler et.al. |
2406.07538v1 |
null |
2024-06-11 |
Towards Fundamentally Scalable Model Selection: Asymptotically Fast Update and Selection |
Wenxiao Wang et.al. |
2406.07536v1 |
null |
2024-06-11 |
Dynamics of the non-radial energy-critical inhomogeneous NLS |
Carlos M. Guzmán et.al. |
2406.07535v1 |
null |
2024-06-11 |
Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement |
Yunzhen Feng et.al. |
2406.07515v1 |
null |
2024-06-11 |
Understanding Visual Concepts Across Models |
Brandon Trabucco et.al. |
2406.07506v1 |
link |
2024-06-10 |
NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing |
Ting-Hsuan Chen et.al. |
2406.06523v1 |
null |
2024-06-10 |
Data Augmentation for Multivariate Time Series Classification: An Experimental Study |
Romain Ilbert et.al. |
2406.06518v1 |
null |
2024-06-10 |
Merlin: A Vision Language Foundation Model for 3D Computed Tomography |
Louis Blankemeier et.al. |
2406.06512v1 |
null |
2024-06-10 |
Monkey See, Monkey Do: Harnessing Self-attention in Motion Diffusion for Zero-shot Motion Transfer |
Sigal Raab et.al. |
2406.06508v1 |
link |
2024-06-10 |
Equivariant Neural Tangent Kernels |
Philipp Misof et.al. |
2406.06504v1 |
null |
2024-06-10 |
Viscous shock fluctuations in KPZ |
Alexander Dunlap et.al. |
2406.06502v1 |
null |
2024-06-10 |
NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative |
Asmar Nadeem et.al. |
2406.06499v1 |
null |
2024-06-10 |
Demonstrating HumanTHOR: A Simulation Platform and Benchmark for Human-Robot Collaboration in a Shared Workspace |
Chenxu Wang et.al. |
2406.06498v1 |
null |
2024-06-10 |
Graph-Based Bidirectional Transformer Decision Threshold Adjustment Algorithm for Class-Imbalanced Molecular Data |
Nicole Hayes et.al. |
2406.06479v1 |
null |
2024-06-10 |
DiffAudit: Auditing Privacy Practices of Online Services for Children and Adolescents |
Olivia Figueira et.al. |
2406.06473v1 |
null |
2024-06-07 |
DVOS: Self-Supervised Dense-Pattern Video Object Segmentation |
Keyhan Najafian et.al. |
2406.05131v1 |
null |
2024-06-07 |
Compositional Curvature Bounds for Deep Neural Networks |
Taha Entesari et.al. |
2406.05119v1 |
null |
2024-06-07 |
Large Generative Graph Models |
Yu Wang et.al. |
2406.05109v1 |
null |
2024-06-07 |
A Novel Time Series-to-Image Encoding Approach for Weather Phenomena Classification |
Christian Giannetti et.al. |
2406.05096v1 |
null |
2024-06-10 |
Discovery of An Apparent Red, High-Velocity Type Ia Supernova at z = 2.9 with JWST |
J. D. R. Pierel et.al. |
2406.05089v2 |
null |
2024-06-07 |
CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion |
Xingrui Wang et.al. |
2406.05082v1 |
null |
2024-06-10 |
Discovery of a Relativistic Stripped Envelope Type Ic-BL Supernova at z = 2.83 with JWST |
M. R. Siebert et.al. |
2406.05076v2 |
null |
2024-06-07 |
Diving Deep into the Motion Representation of Video-Text Models |
Chinmaya Devaraj et.al. |
2406.05075v1 |
null |
2024-06-07 |
Hibou: A Family of Foundational Vision Transformers for Pathology |
Dmitry Nechaev et.al. |
2406.05074v1 |
null |
2024-06-07 |
Classification Metrics for Image Explanations: Towards Building Reliable XAI-Evaluations |
Benjamin Fresz et.al. |
2406.05068v1 |
link |
2024-06-06 |
Verbalized Machine Learning: Revisiting Machine Learning with Language Models |
Tim Z. Xiao et.al. |
2406.04344v1 |
null |
2024-06-07 |
Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion |
Fangfu Liu et.al. |
2406.04338v2 |
null |
2024-06-06 |
Parameter-Inverted Image Pyramid Networks |
Xizhou Zhu et.al. |
2406.04330v1 |
link |
2024-06-06 |
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions |
Lin Chen et.al. |
2406.04325v1 |
null |
2024-06-06 |
SF-V: Single Forward Video Generation Model |
Zhixing Zhang et.al. |
2406.04324v1 |
null |
2024-06-06 |
ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories |
Qianlan Yang et.al. |
2406.04323v1 |
null |
2024-06-06 |
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling |
Zeyue Tian et.al. |
2406.04321v1 |
link |
2024-06-06 |
Chimera: Effectively Modeling Multivariate Time Series with 2-Dimensional State Space Models |
Ali Behrouz et.al. |
2406.04320v1 |
null |
2024-06-06 |
Adaptive Sampling of k-Space in Magnetic Resonance for Rapid Pathology Prediction |
Chen-Yu Yen et.al. |
2406.04318v1 |
null |
2024-06-06 |
Regularized KL-Divergence for Well-Defined Function-Space Variational Inference in Bayesian neural networks |
Tristan Cinquin et.al. |
2406.04317v1 |
null |
2024-06-05 |
Grokking Modular Polynomials |
Darshil Doshi et.al. |
2406.03495v1 |
null |
2024-06-05 |
The Logarithmic Memristor-Based Bayesian Machine |
Clément Turck et.al. |
2406.03492v1 |
null |
2024-06-05 |
Convolutional Neural Networks and Vision Transformers for Fashion MNIST Classification: A Literature Review |
Sonia Bbouzidi et.al. |
2406.03478v1 |
null |
2024-06-05 |
Node-wise Filtering in Graph Neural Networks: A Mixture of Experts Approach |
Haoyu Han et.al. |
2406.03464v1 |
null |
2024-06-05 |
Polarization Wavefront Lidar: Learning Large Scene Reconstruction from Polarized Wavefronts |
Dominik Scheuble et.al. |
2406.03461v1 |
null |
2024-06-05 |
FILS: Self-Supervised Video Feature Prediction In Semantic Language Space |
Mona Ahmadian et.al. |
2406.03447v1 |
null |
2024-06-05 |
Text-to-Events: Synthetic Event Camera Streams from Conditional Text Input |
Joachim Ott et.al. |
2406.03439v1 |
null |
2024-06-05 |
Stabilizing massless fields with fluxes in Landau-Ginzburg models |
Katrin Becker et.al. |
2406.03435v1 |
null |
2024-06-05 |
Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis |
Moein Heidari et.al. |
2406.03430v1 |
link |
2024-06-05 |
Post-hoc Part-prototype Networks |
Andong Tan et.al. |
2406.03421v1 |
null |
2024-06-05 |
Enhancing Temporal Consistency in Video Editing by Reconstructing Videos with 3D Gaussian Splatting |
Inkyu Shin et.al. |
2406.02541v2 |
null |
2024-06-04 |
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation |
Tianchen Zhao et.al. |
2406.02540v1 |
null |
2024-06-04 |
Enhancing predictive imaging biomarker discovery through treatment effect analysis |
Shuhan Xiao et.al. |
2406.02534v1 |
null |
2024-06-04 |
ReLUs Are Sufficient for Learning Implicit Neural Representations |
Joseph Shenouda et.al. |
2406.02529v1 |
link |
2024-06-04 |
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots |
Soroush Nasiriany et.al. |
2406.02523v1 |
null |
2024-06-04 |
DDGS-CT: Direction-Disentangled Gaussian Splatting for Realistic Volume Rendering |
Zhongpai Gao et.al. |
2406.02518v1 |
null |
2024-06-04 |
V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation |
Cong Wang et.al. |
2406.02511v1 |
null |
2024-06-04 |
CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation |
Dejia Xu et.al. |
2406.02509v1 |
null |
2024-06-04 |
Endomorphisms of Artin groups of type $\tilde A_n$ |
Luis Paris et.al. |
2406.02484v1 |
null |
2024-06-04 |
Inpainting Pathology in Lumbar Spine MRI with Latent Diffusion |
Colin Hansen et.al. |
2406.02477v1 |
null |
2024-05-31 |
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis |
Chaoyou Fu et.al. |
2405.21075v1 |
null |
2024-05-31 |
Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights |
Xin Wen et.al. |
2405.21070v1 |
link |
2024-05-31 |
You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet |
Zhen Qin et.al. |
2405.21022v1 |
null |
2024-05-31 |
Beyond Conventional Parametric Modeling: Data-Driven Framework for Estimation and Prediction of Time Activity Curves in Dynamic PET Imaging |
Niloufar Zakariaei et.al. |
2405.21021v1 |
null |
2024-05-31 |
The classification of dp-minimal integral domains |
Christian d'Elbée et.al. |
2405.21014v1 |
null |
2024-05-31 |
Early Stopping Criteria for Training Generative Adversarial Networks in Biomedical Imaging |
Muhammad Muneeb Saad et.al. |
2405.20987v1 |
null |
2024-05-31 |
PUAL: A Classifier on Trifurcate Positive-Unlabeled Data |
Xiaoke Wang et.al. |
2405.20970v1 |
null |
2024-05-31 |
Aligning Multiclass Neural Network Classifier Criterion with Task Performance via $F_\beta$-Score |
Nathan Tsoi et.al. |
2405.20954v1 |
null |
2024-05-31 |
Standard model of electromagnetism and chirality in crystals |
R. Winkler et.al. |
2405.20940v1 |
null |
2024-05-31 |
MALT: Multi-scale Action Learning Transformer for Online Action Detection |
Zhipeng Yang et.al. |
2405.20892v1 |
null |
2024-05-30 |
MotionLLM: Understanding Human Behaviors from Human Motions and Videos |
Ling-Hao Chen et.al. |
2405.20340v1 |
null |
2024-05-30 |
OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving |
Lening Wang et.al. |
2405.20337v1 |
link |
2024-05-30 |
VividDream: Generating 3D Scene with Ambient Dynamics |
Yao-Chih Lee et.al. |
2405.20334v1 |
null |
2024-05-30 |
SurgiTrack: Fine-Grained Multi-Class Multi-Tool Tracking in Surgical Videos |
Chinedu Innocent Nwoye et.al. |
2405.20333v1 |
null |
2024-05-31 |
4DHands: Reconstructing Interactive Hands in 4D with Transformers |
Dixuan Lin et.al. |
2405.20330v2 |
null |
2024-05-30 |
MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion |
Shuyuan Tu et.al. |
2405.20325v1 |
null |
2024-05-30 |
Vision-based Manipulation from Single Human Video with Open-World Object Graphs |
Yifeng Zhu et.al. |
2405.20321v1 |
null |
2024-05-30 |
Improving the Training of Rectified Flows |
Sangyun Lee et.al. |
2405.20320v1 |
link |
2024-05-30 |
CausalQuest: Collecting Natural Causal Questions for AI Agents |
Roberto Ceraolo et.al. |
2405.20318v1 |
link |
2024-05-30 |
Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models |
Himangi Mittal et.al. |
2405.20305v1 |
null |
2024-05-29 |
X-VILA: Cross-Modality Alignment for Large Language Model |
Hanrong Ye et.al. |
2405.19335v1 |
null |
2024-05-29 |
LLMs Meet Multimodal Generation and Editing: A Survey |
Yingqing He et.al. |
2405.19334v1 |
link |
2024-05-29 |
Multi-Modal Generative Embedding Model |
Feipeng Ma et.al. |
2405.19333v1 |
null |
2024-05-29 |
NPGA: Neural Parametric Gaussian Avatars |
Simon Giebenhain et.al. |
2405.19331v1 |
null |
2024-05-29 |
Normative Modules: A Generative Agent Architecture for Learning Norms that Supports Multi-Agent Cooperation |
Atrisha Sarkar et.al. |
2405.19328v1 |
null |
2024-05-29 |
DGD: Dynamic 3D Gaussians Distillation |
Isaac Labe et.al. |
2405.19321v1 |
null |
2024-05-29 |
Real-Time Environment Condition Classification for Autonomous Vehicles |
Marco Introvigne et.al. |
2405.19305v1 |
null |
2024-05-29 |
Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare |
Hanwei Zhu et.al. |
2405.19298v1 |
null |
2024-05-29 |
Archetype-Based Redshift Estimation for the Dark Energy Spectroscopic Instrument Survey |
Abhijeet Anand et.al. |
2405.19288v1 |
null |
2024-05-29 |
A study on the adequacy of common IQA measures for medical images |
Anna Breger et.al. |
2405.19224v1 |
null |
2024-05-28 |
Classifying Overlapping Gaussian Mixtures in High Dimensions: From Optimal Classifiers to Neural Nets |
Khen Cohen et.al. |
2405.18427v1 |
null |
2024-05-28 |
GFlow: Recovering 4D World from Monocular Video |
Shizun Wang et.al. |
2405.18426v1 |
null |
2024-05-28 |
Hierarchical World Models as Visual Whole-Body Humanoid Controllers |
Nicklas Hansen et.al. |
2405.18418v1 |
null |
2024-05-28 |
3D StreetUnveiler with Semantic-Aware 2DGS |
Jingwei Xu et.al. |
2405.18416v1 |
null |
2024-05-28 |
Why are Visually-Grounded Language Models Bad at Image Classification? |
Yuhui Zhang et.al. |
2405.18415v1 |
link |
2024-05-28 |
Towards a Sampling Theory for Implicit Neural Representations |
Mahrokh Najaf et.al. |
2405.18410v1 |
null |
2024-05-28 |
Phased Consistency Model |
Fu-Yun Wang et.al. |
2405.18407v1 |
null |
2024-05-28 |
RACCooN: Remove, Add, and Change Video Content with Auto-Generated Narratives |
Jaehong Yoon et.al. |
2405.18406v1 |
null |
2024-05-28 |
MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning |
Somnath Kumar et.al. |
2405.18358v1 |
null |
2024-05-28 |
Universal and Extensible Language-Vision Models for Organ Segmentation and Tumor Detection from Abdominal Computed Tomography |
Jie Liu et.al. |
2405.18356v1 |
link |
2024-05-27 |
Matryoshka Multimodal Models |
Mu Cai et.al. |
2405.17430v1 |
null |
2024-05-27 |
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models |
Chankyu Lee et.al. |
2405.17428v1 |
null |
2024-05-27 |
MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds |
Jiahui Lei et.al. |
2405.17421v1 |
null |
2024-05-27 |
Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control |
Zhengfei Kuang et.al. |
2405.17414v1 |
null |
2024-05-27 |
Enhancing Music Genre Classification through Multi-Algorithm Analysis and User-Friendly Visualization |
Navin Kamuni et.al. |
2405.17413v1 |
null |
2024-05-27 |
The Peripatetic Hater: Predicting Movement Among Hate Subreddits |
Daniel Hickey et.al. |
2405.17410v1 |
null |
2024-05-27 |
Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer |
Ruizhi Shao et.al. |
2405.17405v1 |
null |
2024-05-27 |
Spectral Greedy Coresets for Graph Neural Networks |
Mucong Ding et.al. |
2405.17404v1 |
null |
2024-05-27 |
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability |
Shenyuan Gao et.al. |
2405.17398v1 |
link |
2024-05-27 |
Non-Unitary Quantum Machine Learning |
Jamie Heredge et.al. |
2405.17388v1 |
null |
2024-05-24 |
Canonical Variates in Wasserstein Metric Space |
Jia Li et.al. |
2405.15768v1 |
null |
2024-05-24 |
Scaling Laws for Discriminative Classification in Large Language Models |
Dean Wyatte et.al. |
2405.15765v1 |
null |
2024-05-24 |
InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation |
Yuchi Wang et.al. |
2405.15758v1 |
link |
2024-05-24 |
Looking Backward: Streaming Video-to-Video Translation with Feature Banks |
Feng Liang et.al. |
2405.15757v1 |
link |
2024-05-24 |
Characterizing Discourse Group Roles in Inquiry-based University Science Labs |
Tong Wan et.al. |
2405.15746v1 |
null |
2024-05-24 |
Hierarchical Uncertainty Exploration via Feedforward Posterior Trees |
Elias Nehme et.al. |
2405.15719v1 |
null |
2024-05-24 |
EmpathicStories++: A Multimodal Dataset for Empathy towards Personal Experiences |
Jocelyn Shen et.al. |
2405.15708v1 |
null |
2024-05-24 |
Sums: Sniffing Unknown Multiband Signals under Low Sampling Rates |
Jinbo Peng et.al. |
2405.15705v1 |
null |
2024-05-24 |
realSEUDO for real-time calcium imaging analysis |
Iuliia Dmitrieva et.al. |
2405.15701v1 |
null |
2024-05-24 |
UNION: Unsupervised 3D Object Detection using Object Appearance-based Pseudo-Classes |
Ted Lentsch et.al. |
2405.15688v1 |
null |
2024-05-23 |
PuzzleAvatar: Assembling 3D Avatars from Personal Albums |
Yuliang Xiu et.al. |
2405.14869v1 |
null |
2024-05-23 |
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis |
Basile Van Hoorick et.al. |
2405.14868v1 |
null |
2024-05-23 |
Video Diffusion Models are Training-free Motion Interpreter and Controller |
Zeqi Xiao et.al. |
2405.14864v1 |
null |
2024-05-23 |
Synergistic Global-space Camera and Human Reconstruction from Videos |
Yizhou Zhao et.al. |
2405.14855v1 |
null |
2024-05-23 |
Domain Wall Magnetic Tunnel Junction Reliable Integrate and Fire Neuron |
Can Cui et.al. |
2405.14851v1 |
null |
2024-05-23 |
Learning to Detect and Segment Mobile Objects from Unlabeled Videos |
Yihong Sun et.al. |
2405.14841v1 |
null |
2024-05-23 |
Designing A Sustainable Marine Debris Clean-up Framework without Human Labels |
Raymond Wang et.al. |
2405.14815v1 |
null |
2024-05-23 |
As an AI Language Model, "Yes I Would Recommend Calling the Police": Norm Inconsistency in LLM Decision-Making |
Shomik Jain et.al. |
2405.14812v1 |
null |
2024-05-23 |
Lorentz-Equivariant Geometric Algebra Transformers for High-Energy Physics |
Jonas Spinner et.al. |
2405.14806v1 |
null |
2024-05-24 |
Fast-DDPM: Fast Denoising Diffusion Probabilistic Models for Medical Image-to-Image Generation |
Hongxu Jiang et.al. |
2405.14802v2 |
link |
2024-05-21 |
Comprehensive Multimodal Deep Learning Survival Prediction Enabled by a Transformer Architecture: A Multicenter Study in Glioblastoma |
Ahmed Gomaa et.al. |
2405.12963v1 |
null |
2024-05-21 |
Online Learning of Halfspaces with Massart Noise |
Ilias Diakonikolas et.al. |
2405.12958v1 |
null |
2024-05-21 |
Quantifying Uncertainty in Classification Performance: ROC Confidence Bands Using Conformal Prediction |
Zheshi Zheng et.al. |
2405.12953v1 |
null |
2024-05-21 |
Tutorly: Turning Programming Videos Into Apprenticeship Learning Environments with LLMs |
Wengxi Li et.al. |
2405.12946v1 |
null |
2024-05-21 |
Pytorch-Wildlife: A Collaborative Deep Learning Framework for Conservation |
Andres Hernandez et.al. |
2405.12930v1 |
link |
2024-05-21 |
Streamlining Software Reviews: Efficient Predictive Modeling with Minimal Examples |
Tim Menzies et.al. |
2405.12920v1 |
null |
2024-05-21 |
The $L_p$-dual space of a semisimple Lie group |
Bachir Bekka et.al. |
2405.12919v1 |
null |
2024-05-21 |
Topic Modelling Case Law Using a Large Language Model and a New Taxonomy for UK Law: AI Insights into Summary Judgment |
Holli Sargeant et.al. |
2405.12910v1 |
link |
2024-05-21 |
Decentralized Federated Learning Over Imperfect Communication Channels |
Weicai Li et.al. |
2405.12894v1 |
null |
2024-05-21 |
Investigating Persuasion Techniques in Arabic: An Empirical Study Leveraging Large Language Models |
Abdurahmman Alzahrani et.al. |
2405.12884v1 |
null |
2024-05-20 |
Images that Sound: Composing Images and Sounds on a Single Canvas |
Ziyang Chen et.al. |
2405.12221v1 |
null |
2024-05-20 |
Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices |
Nathaniel Cohen et.al. |
2405.12211v1 |
null |
2024-05-20 |
The sign of scalar curvature on Kähler blowups |
Garrett M. Brown et.al. |
2405.12189v1 |
null |
2024-05-20 |
Building Temporal Kernels with Orthogonal Polynomials |
Yan Ru Pei et.al. |
2405.12179v1 |
link |
2024-05-20 |
Wireless vs. Traditional Ultrasound Assessed Knee Cartilage Outcomes Utilizing Automated Gain and Normalization Techniques |
Arjun Parmar et.al. |
2405.12172v1 |
null |
2024-05-20 |
DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM |
Xuchen Li et.al. |
2405.12139v1 |
null |
2024-05-20 |
Alzheimer's Magnetic Resonance Imaging Classification Using Deep and Meta-Learning Models |
Nida Nasir et.al. |
2405.12126v1 |
null |
2024-05-20 |
An Active Learning Framework with a Class Balancing Strategy for Time Series Classification |
Shemonto Das et.al. |
2405.12122v1 |
null |
2024-05-20 |
AGNfitter-rx: Modelling the radio-to-X-ray SEDs of AGNs |
L. N. Martínez-Ramírez et.al. |
2405.12111v1 |
null |
2024-05-20 |
Real topological phonons in 3D carbon allotropes |
Xiaotian Wang et.al. |
2405.12072v1 |
null |
2024-05-17 |
Submodular Information Selection for Hypothesis Testing with Misclassification Penalties |
Jayanth Bhargav et.al. |
2405.10930v1 |
null |
2024-05-17 |
A Versatile Framework for Analyzing Galaxy Image Data by Implanting Human-in-the-loop on a Large Vision Model |
Mingxiang Fu et.al. |
2405.10890v1 |
null |
2024-05-17 |
Multicenter Privacy-Preserving Model Training for Deep Learning Brain Metastases Autosegmentation |
Yixing Huang et.al. |
2405.10870v1 |
null |
2024-05-17 |
"Hall" transport of liquid crystal solitons in Couette flow |
Rodrigo C. V. Coelho et.al. |
2405.10850v1 |
null |
2024-05-17 |
Automatic segmentation of Organs at Risk in Head and Neck cancer patients from CT and MRI scans |
Sébastien Quetin et.al. |
2405.10833v1 |
null |
2024-05-17 |
Open-Vocabulary Spatio-Temporal Action Detection |
Tao Wu et.al. |
2405.10832v1 |
null |
2024-05-17 |
Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities |
Hao Zhou et.al. |
2405.10825v1 |
null |
2024-05-17 |
ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios |
Markus Bayer et.al. |
2405.10808v1 |
null |
2024-05-17 |
A Large-scale Multi Domain Leukemia Dataset for the White Blood Cells Detection with Morphological Attributes for Explainability |
Abdul Rehman et.al. |
2405.10803v1 |
null |
2024-05-17 |
Reduced storage direct tensor ring decomposition for convolutional neural networks compression |
Mateusz Gabor et.al. |
2405.10802v1 |
link |
2024-05-16 |
TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction |
Yunfan Jiang et.al. |
2405.10315v1 |
null |
2024-05-16 |
4D Panoptic Scene Graph Generation |
Jingkang Yang et.al. |
2405.10305v1 |
link |
2024-05-16 |
On Sample Selection for Continual Learning: a Video Streaming Case Study |
Alexander Dietmüller et.al. |
2405.10290v1 |
null |
2024-05-16 |
Quantum Vision Transformers for Quark-Gluon Classification |
Marçal Comajoan Cara et.al. |
2405.10284v1 |
null |
2024-05-16 |
Faces that Speak: Jointly Synthesising Talking Face and Speech from Text |
Youngjoon Jang et.al. |
2405.10272v1 |
null |
2024-05-16 |
A Tale of Two Languages: Large-Vocabulary Continuous Sign Language Recognition from Spoken Language Supervision |
Charles Raude et.al. |
2405.10266v1 |
null |
2024-05-16 |
PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology |
George Shaikovski et.al. |
2405.10254v1 |
null |
2024-05-16 |
A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts |
Xinru Zhang et.al. |
2405.10246v1 |
null |
2024-05-16 |
Ternary mappings of some evolution algebras |
Candido Martin Gonzalez et.al. |
2405.10241v1 |
null |
2024-05-16 |
ENADPool: The Edge-Node Attention-based Differentiable Pooling for Graph Neural Networks |
Zhehan Zhao et.al. |
2405.10218v1 |
null |
2024-05-15 |
Classifying geospatial objects from multiview aerial imagery using semantic meshes |
David Russell et.al. |
2405.09544v1 |
null |
2024-05-15 |
Spectral complexity of deep neural networks |
Simmaco Di Lillo et.al. |
2405.09541v1 |
null |
2024-05-16 |
MMFusion: Multi-modality Diffusion Model for Lymph Node Metastasis Diagnosis in Esophageal Cancer |
Chengyu Wu et.al. |
2405.09539v2 |
link |
2024-05-15 |
Restoring balance: principled under/oversampling of data for optimal classification |
Emanuele Loffredo et.al. |
2405.09535v1 |
null |
2024-05-15 |
Tackling Distribution Shifts in Task-Oriented Communication with Information Bottleneck |
Hongru Li et.al. |
2405.09514v1 |
null |
2024-05-15 |
Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts |
Donya Rooein et.al. |
2405.09482v1 |
null |
2024-05-15 |
Perception- and Fidelity-aware Reduced-Reference Super-Resolution Image Quality Assessment |
Xinying Lin et.al. |
2405.09472v1 |
null |
2024-05-15 |
Non-contact Lung Disease Classification via OFDM-based Passive 6G ISAC Sensing |
Hasan Mujtaba Buttar et.al. |
2405.09458v1 |
null |
2024-05-15 |
Cohomogeneity one RCD-spaces |
Diego Corro et.al. |
2405.09448v1 |
null |
2024-05-15 |
M$^4$oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts |
Yufeng Jiang et.al. |
2405.09446v1 |
null |
2024-05-14 |
CinePile: A Long Video Question Answering Dataset and Benchmark |
Ruchit Rawal et.al. |
2405.08813v1 |
null |
2024-05-14 |
The Developing Human Connectome Project: A Fast Deep Learning-based Pipeline for Neonatal Cortical Surface Reconstruction |
Qiang Ma et.al. |
2405.08783v1 |
null |
2024-05-14 |
Harnessing the power of longitudinal medical imaging for eye disease prognosis using Transformer-based sequence modeling |
Gregory Holste et.al. |
2405.08780v1 |
null |
2024-05-14 |
FolkTalent: Enhancing Classification and Tagging of Indian Folk Paintings |
Nancy Hada et.al. |
2405.08776v1 |
null |
2024-05-14 |
From Text to Context: An Entailment Approach for News Stakeholder Classification |
Alapan Kuila et.al. |
2405.08751v1 |
null |
2024-05-14 |
Enhancing Blind Video Quality Assessment with Rich Quality-aware Features |
Wei Sun et.al. |
2405.08745v1 |
null |
2024-05-14 |
The impact of Compositionality in Zero-shot Multi-label action recognition for Object-based tasks |
Carmela Calabrese et.al. |
2405.08695v1 |
null |
2024-05-14 |
Latent group structure in linear panel data models with endogenous regressors |
Junho Choi et.al. |
2405.08687v1 |
null |
2024-05-14 |
Achieving Fairness Through Channel Pruning for Dermatological Disease Diagnosis |
Qingpeng Kong et.al. |
2405.08681v1 |
link |
2024-05-14 |
Investigating Design Choices in Joint-Embedding Predictive Architectures for General Audio Representation Learning |
Alain Riou et.al. |
2405.08679v1 |
null |
2024-05-14 |
MambaOut: Do We Really Need Mamba for Vision? |
Weihao Yu et.al. |
2405.07992v2 |
link |
2024-05-13 |
SPIN: Simultaneous Perception, Interaction and Navigation |
Shagun Uppal et.al. |
2405.07991v1 |
null |
2024-05-13 |
KG-Planner: Knowledge-Informed Graph Neural Planning for Collaborative Manipulators |
Wansong Liu et.al. |
2405.07962v1 |
null |
2024-05-13 |
An Algorithmic Classification of Generalized Pseudo-Anosov Homeomorphisms via Geometric Markov Partitions |
Inti Cruz Diaz et.al. |
2405.07954v1 |
null |
2024-05-13 |
Scene Action Maps: Behavioural Maps for Navigation without Metric Information |
Joel Loo et.al. |
2405.07948v1 |
null |
2024-05-14 |
PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition |
Ziyang Zhang et.al. |
2405.07932v2 |
link |
2024-05-13 |
Improving Multimodal Learning with Multi-Loss Gradient Modulation |
Konstantinos Kontras et.al. |
2405.07930v1 |
null |
2024-05-13 |
PLUTO: Pathology-Universal Transformer |
Dinkar Juyal et.al. |
2405.07905v1 |
null |
2024-05-13 |
Enhancing Clinically Significant Prostate Cancer Prediction in T2-weighted Images through Transfer Learning from Breast Cancer |
Chi-en Amy Tai et.al. |
2405.07869v1 |
null |
2024-05-13 |
Improving Breast Cancer Grade Prediction with Multiparametric MRI Created Using Optimized Synthetic Correlated Diffusion Imaging |
Chi-en Amy Tai et.al. |
2405.07861v1 |
null |
2024-05-10 |
Multi-Object Tracking in the Dark |
Xinzhe Wang et.al. |
2405.06600v1 |
link |
2024-05-10 |
Ice phase classification made easy with score-based denoising |
Hong Sun et.al. |
2405.06599v1 |
null |
2024-05-10 |
Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach |
Elham Ravanbakhsh et.al. |
2405.06586v1 |
null |
2024-05-10 |
Deep video representation learning: a survey |
Elham Ravanbakhsh et.al. |
2405.06574v1 |
null |
2024-05-10 |
The Role of Topological Photon Spheres in Constraining the Parameters of Black Holes |
Jafar Sadeghi et.al. |
2405.06568v1 |
null |
2024-05-10 |
OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation |
Jinwei Lin et.al. |
2405.06547v1 |
link |
2024-05-10 |
Separating States in Astronomical Sources Using Hidden Markov Models: With a Case Study of Flaring and Quiescence on EV Lac |
Robert Zimmerman et.al. |
2405.06540v1 |
null |
2024-05-10 |
Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation |
Xiaowen Ma et.al. |
2405.06525v1 |
link |
2024-05-10 |
Aspect-based Sentiment Evaluation of Chess Moves (ASSESS): an NLP-based Method for Evaluating Chess Strategies from Textbooks |
Haifa Alrdahi et.al. |
2405.06499v1 |
null |
2024-05-10 |
Improving Deep Learning Model Calibration for Cardiac Applications using Deterministic Uncertainty Networks and Uncertainty-aware Training |
Tareen Dawood et.al. |
2405.06487v1 |
null |
2024-05-09 |
A Universal Growth Rate for Learning with Smooth Surrogate Losses |
Anqi Mao et.al. |
2405.05968v1 |
null |
2024-05-09 |
Self-Supervised Learning of Time Series Representation via Diffusion Process and Imputation-Interpolation-Forecasting Mask |
Zineb Senane et.al. |
2405.05959v1 |
link |
2024-05-09 |
Frame Interpolation with Consecutive Brownian Bridge Diffusion |
Zonglin Lyu et.al. |
2405.05953v1 |
null |
2024-05-09 |
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers |
Peng Gao et.al. |
2405.05945v1 |
link |
2024-05-09 |
MRISegmentator-Abdomen: A Fully Automated Multi-Organ and Structure Segmentation Tool for T1-weighted Abdominal MRI |
Yan Zhuang et.al. |
2405.05944v1 |
null |
2024-05-09 |
Non-symplectic automorphisms of prime order of O'Grady's tenfolds and cubic fourfolds |
Simone Billi et.al. |
2405.05932v1 |
null |
2024-05-09 |
Deep Multi-Task Learning for Malware Image Classification |
Ahmed Bensaoud et.al. |
2405.05906v1 |
null |
2024-05-09 |
An RNN-policy gradient approach for quantum architecture search |
Gang Wang et.al. |
2405.05892v1 |
null |
2024-05-09 |
Composable Part-Based Manipulation |
Weiyu Liu et.al. |
2405.05876v1 |
null |
2024-05-09 |
ExACT: An End-to-End Autonomous Excavator System Using Action Chunking With Transformers |
Liangliang Chen et.al. |
2405.05861v1 |
null |
2024-05-08 |
Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo |
Nayantara Mudur et.al. |
2405.05255v1 |
link |
2024-05-08 |
Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models |
Hongjie Wang et.al. |
2405.05252v1 |
null |
2024-05-08 |
DanceCam: atmospheric turbulence mitigation in wide-field astronomical images with short-exposure video streams |
Spencer Bialek et.al. |
2405.05250v1 |
null |
2024-05-08 |
Deep learning-based variational autoencoder for classification of quantum and classical states of light |
Mahesh Bhupati et.al. |
2405.05243v1 |
null |
2024-05-08 |
On $\operatorname{Alt}(n)$-modules with an additive dimension when $n\le6$ |
Barry Chin et.al. |
2405.05230v1 |
null |
2024-05-08 |
Are Economically Advanced Countries More Efficient in Basic and Applied Research? |
Vladimír Holý et.al. |
2405.05227v1 |
null |
2024-05-08 |
Clustering Retail Products Based on Customer Behaviour |
Vladimír Holý et.al. |
2405.05218v1 |
null |
2024-05-08 |
FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models |
Jinglin Xu et.al. |
2405.05216v1 |
link |
2024-05-08 |
Graded Relevance Scoring of Written Essays with Dense Retrieval |
Salam Albatarni et.al. |
2405.05200v1 |
null |
2024-05-08 |
Is Transductive Learning Equivalent to PAC Learning? |
Shaddin Dughmi et.al. |
2405.05190v1 |
null |
2024-05-07 |
Switchable Decision: Dynamic Neural Generation Networks |
Shujian Zhang et.al. |
2405.04513v1 |
null |
2024-05-07 |
Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing |
Yi Zuo et.al. |
2405.04496v1 |
null |
2024-05-07 |
Exploration of Novel Neuromorphic Methodologies for Materials Applications |
Derek Gobin et.al. |
2405.04478v1 |
null |
2024-05-07 |
Generalized classical Yang-Baxter equation and regular decompositions |
Raschid Abedin et.al. |
2405.04440v1 |
null |
2024-05-07 |
On the classification of product-quotient surfaces with $q=0$, $p_g=3$ and their canonical map |
Federico Fallucca et.al. |
2405.04425v1 |
null |
2024-05-07 |
Vision Mamba: A Comprehensive Survey and Taxonomy |
Xiao Liu et.al. |
2405.04404v1 |
link |
2024-05-07 |
Efficient Online Set-valued Classification with Bandit Feedback |
Zhou Wang et.al. |
2405.04393v1 |
null |
2024-05-07 |
DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving |
Chen Min et.al. |
2405.04390v1 |
null |
2024-05-07 |
Parallelized Multi-Agent Bayesian Optimization in Lava |
Shay Snyder et.al. |
2405.04387v1 |
null |
2024-05-07 |
Pragmatist Intelligence: Where the Principle of Usefulness Can Take ANNs |
Antonio Bikić et.al. |
2405.04386v1 |
null |
2024-05-06 |
Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs |
Muhammad Uzair Khattak et.al. |
2405.03690v1 |
null |
2024-05-06 |
All-in-One Deep Learning Framework for MR Image Reconstruction |
Geunu Jeong et.al. |
2405.03684v1 |
null |
2024-05-06 |
ScrewMimic: Bimanual Imitation from Human Videos with Screw Space Projection |
Arpit Bahety et.al. |
2405.03666v1 |
null |
2024-05-06 |
CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification |
Sankalp Sinha et.al. |
2405.03660v1 |
null |
2024-05-06 |
Collecting Consistently High Quality Object Tracks with Minimal Human Involvement by Using Self-Supervised Learning to Detect Tracker Errors |
Samreen Anjum et.al. |
2405.03643v1 |
null |
2024-05-06 |
Classification of Breast Cancer Histopathology Images using a Modified Supervised Contrastive Learning Method |
Matina Mahdizadeh Sani et.al. |
2405.03642v1 |
link |
2024-05-06 |
Nonequilibrium relaxation and odd-even effect in finite-temperature electron gases |
Eric Nilsson et.al. |
2405.03635v1 |
null |
2024-05-06 |
Nonnegative Matrix Factorization in Dimensionality Reduction: A Survey |
Farid Saberi-Movahed et.al. |
2405.03615v1 |
null |
2024-05-06 |
Dual Relation Mining Network for Zero-Shot Learning |
Jinwei Han et.al. |
2405.03613v1 |
null |
2024-05-06 |
Communities for the Lagrangian Dynamics of the Turbulent Velocity Gradient Tensor: A Network Participation Approach |
Christopher J. Keylock et.al. |
2405.03589v1 |
null |
2024-05-03 |
DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos |
Wen-Hsuan Chu et.al. |
2405.02280v1 |
null |
2024-05-03 |
Transversely Projective Structures on Smooth Foliations on Surfaces |
Gabriel Fazoli et.al. |
2405.02273v1 |
null |
2024-05-03 |
On its way to the neutron star-white dwarf binary graveyard, IGR J16194-2810, a first ascent M giant X-ray binary |
K. H. Hinkle et.al. |
2405.02270v1 |
null |
2024-05-03 |
Validating Gaia DR3 Pulsating Variable Classifications with TESS: Building Reliable $δ$ Scuti and $γ$ Doradus Stars Catalogs (In Progress) |
Ai-Ying Zhou et.al. |
2405.02264v1 |
null |
2024-05-03 |
Subgraph2vec: A random walk-based algorithm for embedding knowledge graphs |
Elika Bozorgi et.al. |
2405.02240v1 |
null |
2024-05-03 |
Fair Risk Control: A Generalized Framework for Calibrating Multi-group Fairness Risks |
Lujing Zhang et.al. |
2405.02225v1 |
null |
2024-05-03 |
Designed Dithering Sign Activation for Binary Neural Networks |
Brayan Monroy et.al. |
2405.02220v1 |
null |
2024-05-03 |
Multispectral Fine-Grained Classification of Blackgrass in Wheat and Barley Crops |
Madeleine Darbyshire et.al. |
2405.02218v1 |
null |
2024-05-03 |
Non-Destructive Peat Analysis using Hyperspectral Imaging and Machine Learning |
Yijun Yan et.al. |
2405.02191v1 |
null |
2024-05-03 |
Hoaxpedia: A Unified Wikipedia Hoax Articles Dataset |
Hsuvas Borkakoty et.al. |
2405.02175v1 |
null |
2024-05-02 |
Confronting sparse Gaia DR3 photometry with TESS for a sample of about 60,000 hot massive non-radial pulsators |
Daniel Hey et.al. |
2405.01539v1 |
null |
2024-05-02 |
Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks |
Murtaza Dalal et.al. |
2405.01534v1 |
null |
2024-05-02 |
Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models |
Nishad Singhi et.al. |
2405.01531v1 |
null |
2024-05-02 |
Track2Act: Predicting Point Tracks from Internet Videos enables Diverse Zero-shot Robot Manipulation |
Homanga Bharadhwaj et.al. |
2405.01527v1 |
null |
2024-05-03 |
A separability-based approach to quantifying generalization: which layer is best? |
Luciano Dyballa et.al. |
2405.01524v2 |
null |
2024-05-02 |
Grand Design vs. Multi-Armed Spiral Galaxies: Dependence on Galaxy Structure |
Beverly J. Smith et.al. |
2405.01516v1 |
null |
2024-05-03 |
Accelerating Convergence in Bayesian Few-Shot Classification |
Tianjun Ke et.al. |
2405.01507v2 |
link |
2024-05-02 |
PAM-UNet: Shifting Attention on Region of Interest in Medical Images |
Abhijit Das et.al. |
2405.01503v1 |
null |
2024-05-02 |
Exploring Privacy Issues in Mission Critical Communication: Navigating 5G and Beyond Networks |
Prajnamaya Dass et.al. |
2405.01492v1 |
null |
2024-05-02 |
Designing Algorithmic Recommendations to Achieve Human-AI Complementarity |
Bryce McLaughlin et.al. |
2405.01484v1 |
null |
2024-05-01 |
Quantum algorithms for matrix geometric means |
Nana Liu et.al. |
2405.00673v1 |
null |
2024-05-01 |
Adapting Pretrained Networks for Image Quality Assessment on High Dynamic Range Displays |
Andrei Chubarau et.al. |
2405.00670v1 |
null |
2024-05-01 |
Screening of BindingDB database ligands against EGFR, HER2, Estrogen, Progesterone and NF-kB receptors based on machine learning and molecular docking |
Parham Rezaee et.al. |
2405.00647v1 |
null |
2024-05-01 |
Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling |
Yida Mu et.al. |
2405.00611v1 |
null |
2024-05-01 |
Investigating Automatic Scoring and Feedback using Large Language Models |
Gloria Ashiya Katuka et.al. |
2405.00602v1 |
null |
2024-05-01 |
Discovering robust biomarkers of neurological disorders from functional MRI using graph neural networks: A Review |
Yi Hao Chan et.al. |
2405.00577v1 |
null |
2024-05-01 |
EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model |
Deng Li et.al. |
2405.00574v1 |
null |
2024-05-01 |
Remote Sensing Data Assimilation with a Chained Hydrologic-hydraulic Model for Flood Forecasting |
Thanh Huy Nguyen et.al. |
2405.00567v1 |
null |
2024-05-01 |
Digital-analog quantum convolutional neural networks for image classification |
Anton Simen et.al. |
2405.00548v1 |
null |
2024-05-01 |
UWAFA-GAN: Ultra-Wide-Angle Fluorescein Angiography Transformation via Multi-scale Generation and Registration Enhancement |
Ruiquan Ge et.al. |
2405.00542v1 |
link |
2024-04-30 |
A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications |
Steph Buongiorno et.al. |
2404.19729v1 |
null |
2024-04-30 |
Classification of simple 0-dimensional isolated complete intersection singularities |
Thuy Huong Pham et.al. |
2404.19728v1 |
null |
2024-04-30 |
PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios |
Jingbo Wang et.al. |
2404.19722v1 |
null |
2024-04-30 |
PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games |
Steph Buongiorno et.al. |
2404.19721v1 |
null |
2024-04-30 |
ThangDLU at #SMM4H 2024: Encoder-decoder models for classifying text data on social disorders in children and adolescents |
Hoang-Thang Ta et.al. |
2404.19714v1 |
null |
2024-04-30 |
A rank decomposition for the topological classification of neural representations |
Kosio Beshkov et.al. |
2404.19710v1 |
null |
2024-04-30 |
Neural Controlled Differential Equations with Quantum Hidden Evolutions |
Lingyi Yang et.al. |
2404.19673v1 |
link |
2024-04-30 |
Beyond MOS: Subjective Image Quality Score Preprocessing Method Based on Perceptual Similarity |
Lei Wang et.al. |
2404.19666v1 |
null |
2024-04-30 |
Towards Generalist Robot Learning from Internet Video: A Survey |
Robert McCarthy et.al. |
2404.19664v1 |
null |
2024-04-30 |
Regularization of Riemannian optimization: Application to process tomography and quantum machine learning |
Felix Soest et.al. |
2404.19659v1 |
null |
2024-04-29 |
Hallucination of Multimodal Large Language Models: A Survey |
Zechen Bai et.al. |
2404.18930v1 |
link |
2024-04-29 |
Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing |
Leonardo Rossi et.al. |
2404.18924v1 |
null |
2024-04-29 |
Anomaly and invertible field theory with higher-form symmetry: Extended group cohomology |
Shi Chen et.al. |
2404.18921v1 |
null |
2024-04-29 |
A Survey on Diffusion Models for Time Series and Spatio-Temporal Data |
Yiyuan Yang et.al. |
2404.18886v1 |
link |
2024-04-29 |
A Multilevel Strategy to Improve People Tracking in a Real-World Scenario |
Cristiano B. de Oliveira et.al. |
2404.18876v1 |
null |
2024-04-29 |
A Survey on Vision Mamba: Models, Applications and Challenges |
Rui Xu et.al. |
2404.18861v1 |
link |
2024-04-29 |
ConPro: Learning Severity Representation for Medical Images using Contrastive Learning and Preference Optimization |
Hong Nguyen et.al. |
2404.18831v1 |
link |
2024-04-29 |
Towards Extreme Image Compression with Latent Feature Guidance and Diffusion Prior |
Zhiyuan Li et.al. |
2404.18820v1 |
null |
2024-04-29 |
Certification of Speaker Recognition Models to Additive Perturbations |
Dmitrii Korzh et.al. |
2404.18791v1 |
null |
2024-04-29 |
Understanding Radicals via Orbital Parities |
Reza G. Shirazi et.al. |
2404.18787v1 |
null |
2024-04-26 |
Tunnel Try-on: Excavating Spatial-temporal Tunnels for High-quality Virtual Try-on in Videos |
Zhengze Xu et.al. |
2404.17571v1 |
null |
2024-04-26 |
Multifold topological semimetals |
Iñigo Robredo et.al. |
2404.17539v1 |
null |
2024-04-26 |
Exploring the Distinctiveness and Fidelity of the Descriptions Generated by Large Vision-Language Models |
Yuhang Huang et.al. |
2404.17534v1 |
null |
2024-04-26 |
Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations |
Puhao Li et.al. |
2404.17521v1 |
link |
2024-04-26 |
Learning text-to-video retrieval from image captioning |
Lucas Ventura et.al. |
2404.17498v1 |
null |
2024-04-26 |
Tabular Data Contrastive Learning via Class-Conditioned and Feature-Correlation Based Augmentation |
Wei Cui et.al. |
2404.17489v1 |
link |
2024-04-26 |
Low Cost Machine Vision for Insect Classification |
Danja Brandt et.al. |
2404.17488v1 |
null |
2024-04-26 |
Conformal Prediction with Learned Features |
Shayan Kiyani et.al. |
2404.17487v1 |
null |
2024-04-26 |
Sparse Reconstruction of Optical Doppler Tomography Based on State Space Model |
Zhenghong Li et.al. |
2404.17484v1 |
null |
2024-04-26 |
One-Shot Image Restoration |
Deborah Pereg et.al. |
2404.17426v1 |
null |
2024-04-25 |
Made to Order: Discovering monotonic temporal changes via self-supervised video ordering |
Charig Yang et.al. |
2404.16828v1 |
null |
2024-04-25 |
ResVR: Joint Rescaling and Viewport Rendering of Omnidirectional Images |
Weiqi Li et.al. |
2404.16825v1 |
null |
2024-04-25 |
V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection |
Xuanyu Zhang et.al. |
2404.16824v1 |
null |
2024-04-25 |
Learning Visuotactile Skills with Two Multifingered Hands |
Toru Lin et.al. |
2404.16823v1 |
link |
2024-04-25 |
Meta-Transfer Derm-Diagnosis: Exploring Few-Shot Learning and Transfer Learning for Skin Disease Classification in Long-Tail Distribution |
Zeynep Özdemir et.al. |
2404.16814v1 |
null |
2024-04-25 |
Transformer-Based Local Feature Matching for Multimodal Image Registration |
Remi Delaunay et.al. |
2404.16802v1 |
null |
2024-04-25 |
DrS: Learning Reusable Dense Rewards for Multi-Stage Tasks |
Tongzhou Mu et.al. |
2404.16779v1 |
null |
2024-04-25 |
Modeling Selective Feature Attention for Representation-based Siamese Text Matching |
Jianxiang Zang et.al. |
2404.16776v1 |
link |
2024-04-25 |
Classifying One-Dimensional Quantum States Prepared by a Single Round of Measurements |
Rahul Sahay et.al. |
2404.16753v1 |
null |
2024-04-25 |
Characterizing Solar Center-to-Limb Radial-Velocity Variability with SDO |
Michael L. Palumbo III et.al. |
2404.16747v1 |
null |
2024-04-24 |
Optimizing OOD Detection in Molecular Graphs: A Novel Approach with Diffusion Models |
Xu Shen et.al. |
2404.15625v1 |
null |
2024-04-24 |
Layer Ensemble Averaging for Improving Memristor-Based Artificial Neural Network Performance |
Osama Yousuf et.al. |
2404.15621v1 |
null |
2024-04-24 |
A Dynamic Kernel Prior Model for Unsupervised Blind Image Super-Resolution |
Zhixiong Yang et.al. |
2404.15620v1 |
link |
2024-04-24 |
MDDD: Manifold-based Domain Adaptation with Dynamic Distribution for Non-Deep Transfer Learning in Cross-subject and Cross-session EEG-based Emotion Recognition |
Ting Luo et.al. |
2404.15615v1 |
null |
2024-04-24 |
Federated Learning with Only Positive Labels by Exploring Label Correlations |
Xuming An et.al. |
2404.15598v1 |
null |
2024-04-24 |
A Survey of Deep Long-Tail Classification Advancements |
Charika de Alvis et.al. |
2404.15593v1 |
null |
2024-04-24 |
Domain Adaptation for Learned Image Compression with Supervised Adapters |
Alberto Presta et.al. |
2404.15591v1 |
null |
2024-04-24 |
Brain Storm Optimization Based Swarm Learning for Diabetic Retinopathy Image Classification |
Liang Qu et.al. |
2404.15585v1 |
null |
2024-04-24 |
Research on OPF control of three-phase four-wire low-voltage distribution network considering uncertainty |
Rui Wang et.al. |
2404.15584v1 |
null |
2024-04-24 |
MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis |
Jiaxin Zhuang et.al. |
2404.15580v1 |
null |
2024-04-23 |
ID-Animator: Zero-Shot Identity-Preserving Human Video Generation |
Xuanhua He et.al. |
2404.15275v1 |
link |
2024-04-23 |
Metric-guided Image Reconstruction Bounds via Conformal Prediction |
Matt Y Cheung et.al. |
2404.15274v1 |
link |
2024-04-23 |
Quantum optical classifier with superexponential speedup |
Simone Roncallo et.al. |
2404.15266v1 |
null |
2024-04-23 |
TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting |
Jiahe Li et.al. |
2404.15264v1 |
null |
2024-04-23 |
Multi-Session SLAM with Differentiable Wide-Baseline Pose Optimization |
Lahav Lipson et.al. |
2404.15263v1 |
link |
2024-04-23 |
FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent |
Cameron Smith et.al. |
2404.15259v1 |
null |
2024-04-23 |
Source-free Domain Adaptation for Video Object Detection Under Adverse Image Conditions |
Xingguang Zhang et.al. |
2404.15252v1 |
null |
2024-04-23 |
Unifying the Temperature Dependent Dynamics of Glasses |
Joseph B. Schlenoff et.al. |
2404.15250v1 |
null |
2024-04-23 |
Mining Invariance from Nonlinear Multi-Environment Data: Binary Classification |
Austin Goddard et.al. |
2404.15245v1 |
null |
2024-04-23 |
Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models |
Aidan Z. H. Yang et.al. |
2404.15236v1 |
null |
2024-04-22 |
AutoAD III: The Prequel -- Back to the Pixels |
Tengda Han et.al. |
2404.14412v1 |
null |
2024-04-22 |
Guess The Unseen: Dynamic 3D Scene Reconstruction from Partial 2D Glimpses |
Inhee Lee et.al. |
2404.14410v1 |
null |
2024-04-22 |
Hyp-OC: Hyperbolic One Class Classification for Face Anti-Spoofing |
Kartik Narayan et.al. |
2404.14406v1 |
null |
2024-04-22 |
A mean curvature flow arising in adversarial training |
Leon Bungert et.al. |
2404.14402v1 |
null |
2024-04-22 |
TAVGBench: Benchmarking Text to Audible-Video Generation |
Yuxin Mao et.al. |
2404.14381v1 |
link |
2024-04-22 |
Rethinking Legal Compliance Automation: Opportunities with Large Language Models |
Shabnam Hassani et.al. |
2404.14356v1 |
null |
2024-04-22 |
On-the-Fly Point Annotation for Fast Medical Video Labeling |
Meyer Adrien et.al. |
2404.14344v1 |
null |
2024-04-22 |
X-Ray: A Sequential 3D Representation for Generation |
Tao Hu et.al. |
2404.14329v1 |
null |
2024-04-22 |
A Novel Approach to Chest X-ray Lung Segmentation Using U-net and Modified Convolutional Block Attention Module |
Mohammad Ali Labbaf Khaniki et.al. |
2404.14322v1 |
null |
2024-04-22 |
"I Upload...All Types of Different Things to Say, the World of Blindness Is More Than What They Think It Is": A Study of Blind TikTokers' Identity Work from a Flourishing Perspective |
Yao Lyu et.al. |
2404.14305v1 |
null |
2024-04-19 |
Data Alignment for Zero-Shot Concept Generation in Dermatology AI |
Soham Gadgil et.al. |
2404.13043v1 |
null |
2024-04-19 |
PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation |
Tianyuan Zhang et.al. |
2404.13026v1 |
null |
2024-04-19 |
BANF: Band-limited Neural Fields for Levels of Detail Reconstruction |
Ahan Shabanov et.al. |
2404.13024v1 |
null |
2024-04-19 |
Stronger Random Baselines for In-Context Learning |
Gregory Yauney et.al. |
2404.13020v1 |
link |
2024-04-19 |
A New Multi-Picture Architecture for Learned Video Deinterlacing and Demosaicing with Parallel Deformable Convolution and Self-Attention Blocks |
Ronglei Ji et.al. |
2404.13018v1 |
null |
2024-04-19 |
Towards Robust Ferrous Scrap Material Classification with Deep Learning and Conformal Prediction |
Paulo Henrique dos Santos et.al. |
2404.13002v1 |
null |
2024-04-19 |
RadRotator: 3D Rotation of Radiographs with Diffusion Models |
Pouria Rouzrokh et.al. |
2404.13000v1 |
null |
2024-04-19 |
Nuclei Instance Segmentation of Cryosectioned H&E Stained Histological Images using Triple U-Net Architecture |
Zarif Ahmed et.al. |
2404.12986v1 |
null |
2024-04-19 |
Cross-modal Diffusion Modelling for Super-resolved Spatial Transcriptomics |
Xiaofei Wang et.al. |
2404.12973v1 |
null |
2024-04-19 |
Improving Pediatric Pneumonia Diagnosis with Adult Chest X-ray Images Utilizing Contrastive Learning and Embedding Similarity |
Mohammad Zunaed et.al. |
2404.12958v1 |
null |
2024-04-18 |
On the Content Bias in Fréchet Video Distance |
Songwei Ge et.al. |
2404.12391v1 |
null |
2024-04-18 |
Moving Object Segmentation: All You Need Is SAM (and Flow) |
Junyu Xie et.al. |
2404.12389v1 |
null |
2024-04-18 |
VideoGigaGAN: Towards Detail-rich Video Super-Resolution |
Yiran Xu et.al. |
2404.12388v1 |
null |
2024-04-18 |
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models |
Aitor Ormazabal et.al. |
2404.12387v1 |
null |
2024-04-18 |
G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis |
Yufei Ye et.al. |
2404.12383v1 |
null |
2024-04-18 |
Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos |
Isabella Liu et.al. |
2404.12379v1 |
null |
2024-04-18 |
RoboDreamer: Learning Compositional World Models for Robot Imagination |
Siyuan Zhou et.al. |
2404.12377v1 |
null |
2024-04-18 |
When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes |
Asaf Yehudai et.al. |
2404.12365v1 |
null |
2024-04-18 |
Inverse Neural Rendering for Explainable Multi-Object Tracking |
Julian Ost et.al. |
2404.12359v1 |
null |
2024-04-18 |
Improving the interpretability of GNN predictions through conformal-based graph sparsification |
Pablo Sanchez-Martin et.al. |
2404.12356v1 |
link |
2024-04-18 |
Dynamic Typography: Bringing Text to Life via Video Diffusion Prior |
Zichen Liu et.al. |
2404.11614v2 |
null |
2024-04-17 |
VG4D: Vision-Language Model Goes 4D Video Recognition |
Zhichao Deng et.al. |
2404.11605v1 |
link |
2024-04-17 |
Variational Bayesian Last Layers |
James Harrison et.al. |
2404.11599v1 |
link |
2024-04-17 |
State-space Decomposition Model for Video Prediction Considering Long-term Motion Trend |
Fei Cui et.al. |
2404.11576v1 |
null |
2024-04-17 |
Simple Image Signal Processing using Global Context Guidance |
Omar Elezabi et.al. |
2404.11569v1 |
link |
2024-04-17 |
Spatio-Temporal Motion Retargeting for Quadruped Robots |
Taerim Yoon et.al. |
2404.11557v1 |
null |
2024-04-17 |
Predicting Long-horizon Futures by Conditioning on Geometry and Time |
Tarasha Khurana et.al. |
2404.11554v1 |
null |
2024-04-17 |
Carbon- and Oxygen-rich stars in MaStar: identification and classification |
Lewis Hill et.al. |
2404.11541v1 |
null |
2024-04-17 |
GenFighter: A Generative and Evolutive Textual Attack Removal |
Md Athikul Islam et.al. |
2404.11538v1 |
null |
2024-04-17 |
SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening |
Yu Zhong et.al. |
2404.11537v1 |
null |
2024-04-16 |
COMBO: Compositional World Models for Embodied Multi-Agent Cooperation |
Hongxin Zhang et.al. |
2404.10775v1 |
null |
2024-04-16 |
RapidVol: Rapid Reconstruction of 3D Ultrasound Volumes from Sensorless 2D Scans |
Mark C. Eid et.al. |
2404.10766v1 |
null |
2024-04-16 |
Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification |
Yu-Yang Li et.al. |
2404.10757v1 |
null |
2024-04-16 |
Integer-valued o-minimal functions |
Neer Bhardwaj et.al. |
2404.10737v1 |
null |
2024-04-16 |
Randomized Exploration in Cooperative Multi-Agent Reinforcement Learning |
Hao-Lun Hsu et.al. |
2404.10728v1 |
null |
2024-04-16 |
AV-GAN: Attention-Based Varifocal Generative Adversarial Network for Uneven Medical Image Translation |
Zexin Li et.al. |
2404.10714v1 |
null |
2024-04-17 |
Dual Modalities of Text: Visual and Textual Generative Pre-training |
Yekun Chai et.al. |
2404.10710v2 |
null |
2024-04-16 |
Question Difficulty Ranking for Multiple-Choice Reading Comprehension |
Vatsal Raina et.al. |
2404.10704v1 |
null |
2024-04-16 |
Retrieval Augmented Verification: Unveiling Disinformation with Structured Representations for Zero-Shot Real-Time Evidence-guided Fact-Checking of Multi-modal Social media posts |
Arka Ujjal Dey et.al. |
2404.10702v1 |
null |
2024-04-16 |
Rawformer: Unpaired Raw-to-Raw Translation for Learnable Camera ISPs |
Georgy Perevozchikov et.al. |
2404.10700v1 |
null |
2024-04-15 |
Squish Jamming |
Samuel Poincloux et.al. |
2404.09773v1 |
null |
2024-04-15 |
Hilti SLAM Challenge 2023: Benchmarking Single + Multi-session SLAM across Sensor Constellations in Construction |
Ashish Devadas Nair et.al. |
2404.09765v1 |
null |
2024-04-15 |
Deep Learning-Based Segmentation of Tumors in PET/CT Volumes: Benchmark of Different Architectures and Training Strategies |
Monika Górka et.al. |
2404.09761v1 |
null |
2024-04-15 |
Quantization of Large Language Models with an Overdetermined Basis |
Daniil Merkulov et.al. |
2404.09737v1 |
null |
2024-04-15 |
FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-pose, and Facial Expression Features |
Andre Rochow et.al. |
2404.09736v1 |
null |
2024-04-15 |
Classification of finite type fusion quivers |
Ben Elias et.al. |
2404.09714v1 |
null |
2024-04-15 |
LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models |
Guangyan Li et.al. |
2404.09695v1 |
null |
2024-04-15 |
Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration |
Chenwei Lin et.al. |
2404.09690v1 |
null |
2024-04-15 |
Post-Training Network Compression for 3D Medical Image Segmentation: Reducing Computational Efforts via Tucker Decomposition |
Tobias Weber et.al. |
2404.09683v1 |
link |
2024-04-15 |
Cluster analysis of the Roma-BZCAT blazars |
D. O. Kudryavtsev et.al. |
2404.09667v1 |
null |
2024-04-15 |
Deformable MRI Sequence Registration for AI-based Prostate Cancer Diagnosis |
Alessa Hering et.al. |
2404.09666v1 |
null |
2024-04-15 |
Closing the Gap in the Trade-off between Fair Representations and Accuracy |
Biswajit Rout et.al. |
2404.09664v1 |
null |
2024-04-15 |
If there's a Trigger Warning, then where's the Trigger? Investigating Trigger Warnings at the Passage Level |
Matti Wiegmann et.al. |
2404.09615v1 |
link |
2024-04-12 |
FCert: Certifiably Robust Few-Shot Classification in the Era of Foundation Models |
Yanting Wang et.al. |
2404.08631v1 |
null |
2024-04-12 |
Classification of Boolean Algebras through von Neumann regular $\mathcal{C}^{\infty}-$Rings |
Jean Cerqueira Berni et.al. |
2404.08629v1 |
null |
2024-04-12 |
Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation |
Yanhao Zheng et.al. |
2404.08603v1 |
link |
2024-04-12 |
Pathological Primitive Segmentation Based on Visual Foundation Model with Zero-Shot Mask Generation |
Abu Bakor Hayat Arnob et.al. |
2404.08584v1 |
link |
2024-04-12 |
Lossy Image Compression with Foundation Diffusion Models |
Lucas Relic et.al. |
2404.08580v1 |
null |
2024-04-12 |
IDD-X: A Multi-View Dataset for Ego-relative Important Object Localization and Explanation in Dense and Unstructured Traffic |
Chirag Parikh et.al. |
2404.08561v1 |
null |
2024-04-12 |
Scalability in Building Component Data Annotation: Enhancing Facade Material Classification with Synthetic Data |
Josie Harrison et.al. |
2404.08557v1 |
null |
2024-04-12 |
Benchmarking the Cell Image Segmentation Models Robustness under the Microscope Optical Aberrations |
Boyuan Peng et.al. |
2404.08549v1 |
null |
2024-04-12 |
VertAttack: Taking advantage of Text Classifiers' horizontal vision |
Jonathan Rusert et.al. |
2404.08538v1 |
null |
2024-04-12 |
Text Prompt with Normality Guidance for Weakly Supervised Video Anomaly Detection |
Zhiwei Yang et.al. |
2404.08531v1 |
null |
2024-04-11 |
Connecting NeRFs, Images, and Text |
Francesco Ballerini et.al. |
2404.07993v1 |
null |
2024-04-11 |
GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh |
Jing Wen et.al. |
2404.07991v1 |
null |
2024-04-11 |
WaveMo: Learning Wavefront Modulations to See Through Scattering |
Mingyang Xie et.al. |
2404.07985v1 |
null |
2024-04-11 |
Gaga: Group Any Gaussians via 3D-aware Memory Bank |
Weijie Lyu et.al. |
2404.07977v1 |
null |
2024-04-11 |
FusionMamba: Efficient Image Fusion with State Space Model |
Siran Peng et.al. |
2404.07932v1 |
null |
2024-04-11 |
HGRN2: Gated Linear RNNs with State Expansion |
Zhen Qin et.al. |
2404.07904v1 |
link |
2024-04-11 |
Q-ITAGS: Quality-Optimized Spatio-Temporal Heterogeneous Task Allocation with a Time Budget |
Glen Neville et.al. |
2404.07902v1 |
null |
2024-04-11 |
Auditing health-related recommendations in social media: A Case Study of Abortion on YouTube |
Mohammed Lahsaini et.al. |
2404.07896v1 |
null |
2024-04-11 |
Typical blocks of the category $\mathcal O$ and Whittaker modules for Takiff superalgebras |
Chih-Whi Chen et.al. |
2404.07894v1 |
null |
2024-04-11 |
Context-aware Video Anomaly Detection in Long-Term Datasets |
Zhengye Yang et.al. |
2404.07887v1 |
null |
2024-04-10 |
RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion |
Jaidev Shriram et.al. |
2404.07199v1 |
null |
2024-04-10 |
GCV-Turbo: End-to-end Acceleration of GNN-based Computer Vision Tasks on FPGA |
Bingyi Zhang et.al. |
2404.07188v1 |
null |
2024-04-10 |
Adinkras and Pure Spinors |
Richard Eager et.al. |
2404.07167v1 |
null |
2024-04-10 |
Lost in Translation: Modern Neural Networks Still Struggle With Small Realistic Image Transformations |
Ofir Shifman et.al. |
2404.07153v1 |
null |
2024-04-10 |
Learning of deep convolutional network image classifiers via stochastic gradient descent and over-parametrization |
Michael Kohler et.al. |
2404.07128v1 |
null |
2024-04-10 |
Measuring proximity to standard planes during fetal brain ultrasound scanning |
Chiara Di Vece et.al. |
2404.07124v1 |
null |
2024-04-10 |
"My toxic trait is thinking I'll remember this": gaps in the learner experience of video tutorials for feature-rich software |
Ian Drosos et.al. |
2404.07114v1 |
null |
2024-04-10 |
The generic dual of p-adic groups and applications |
Chris Jantzen et.al. |
2404.07111v1 |
null |
2024-04-10 |
Learning Priors for Non Rigid SfM from Casual Videos |
Yoni Kasten et.al. |
2404.07097v1 |
null |
2024-04-10 |
VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning |
Alexandros Xenos et.al. |
2404.07078v1 |
link |
2024-04-09 |
MoReVQA: Exploring Modular Reasoning Models for Video Question Answering |
Juhong Min et.al. |
2404.06511v1 |
null |
2024-04-10 |
Reconstructing Hand-Held Objects in 3D |
Jane Wu et.al. |
2404.06507v2 |
null |
2024-04-09 |
A Machine Learning Framework for the Prediction of Grain Boundary Segregation in Chemically Complex Environments |
Doruk Aksoy et.al. |
2404.06499v1 |
null |
2024-04-10 |
Flying with Photons: Rendering Novel Views of Propagating Light |
Anagh Malik et.al. |
2404.06493v2 |
null |
2024-04-09 |
Uncovering Tidal Treasures: Automated Classification of Faint Tidal Features in DECaLS Data |
Alexander J. Gordon et.al. |
2404.06487v1 |
null |
2024-04-09 |
RhythmMamba: Fast Remote Physiological Measurement with Arbitrary Length Videos |
Bochao Zou et.al. |
2404.06483v1 |
null |
2024-04-09 |
Laue Indexing with Optimal Transport |
Tomasz Kacprzak et.al. |
2404.06478v1 |
link |
2024-04-09 |
A comparative analysis of deep learning models for lung segmentation on X-ray images |
Weronika Hryniewska-Guzik et.al. |
2404.06455v1 |
link |
2024-04-09 |
QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding |
Yash Mehan et.al. |
2404.06442v1 |
null |
2024-04-09 |
ClassiPyGRB: Machine Learning-Based Classification and Visualization of Gamma Ray Bursts using t-SNE |
Keneth Garcia-Cifuentes et.al. |
2404.06439v1 |
null |
2024-04-08 |
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding |
Bo He et.al. |
2404.05726v1 |
null |
2024-04-08 |
Predicting Overtakes in Trucks Using CAN Data |
Talha Hanif Butt et.al. |
2404.05723v1 |
null |
2024-04-08 |
Case Study: Neural Network Malware Detection Verification for Feature and Image Datasets |
Preston K. Robinette et.al. |
2404.05703v1 |
null |
2024-04-08 |
Comprehensive Study on German Language Models for Clinical and Biomedical Text Understanding |
Ahmad Idrissi-Yaghir et.al. |
2404.05694v1 |
null |
2024-04-08 |
Evaluating the Efficacy of Cut-and-Paste Data Augmentation in Semantic Segmentation for Satellite Imagery |
Ionut M. Motoi et.al. |
2404.05693v1 |
null |
2024-04-08 |
AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation |
Jiannan Ge et.al. |
2404.05667v1 |
null |
2024-04-08 |
Oblique photons, plasmons, and current-plasmons in relativistic plasmas and their topological implications |
Hong Qin et.al. |
2404.05636v1 |
null |
2024-04-08 |
AnchorAL: Computationally Efficient Active Learning for Large and Imbalanced Datasets |
Pietro Lesci et.al. |
2404.05623v1 |
null |
2024-04-08 |
Experimental observation of a time rondeau crystal: Temporal Disorder in Spatiotemporal Order |
Leo Joon Il Moon et.al. |
2404.05620v1 |
null |
2024-04-08 |
Self-Explainable Affordance Learning with Embodied Caption |
Zhipeng Zhang et.al. |
2404.05603v1 |
null |
2024-04-05 |
On classification of global dynamics for energy-critical equivariant harmonic map heat flows and radial nonlinear heat equation |
Kihyun Kim et.al. |
2404.04247v1 |
null |
2024-04-05 |
Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism |
Trilokesh Ranjan Sarkar et.al. |
2404.04245v1 |
null |
2024-04-05 |
player2vec: A Language Modeling Approach to Understand Player Behavior in Games |
Tianze Wang et.al. |
2404.04234v1 |
null |
2024-04-05 |
Deep-learning Segmentation of Small Volumes in CT images for Radiotherapy Treatment Planning |
Jianxin Zhou et.al. |
2404.04202v1 |
null |
2024-04-05 |
SCAResNet: A ResNet Variant Optimized for Tiny Object Detection in Transmission and Distribution Towers |
Weile Li et.al. |
2404.04179v1 |
link |
2024-04-05 |
Noisy Label Processing for Classification: A Survey |
Mengting Li et.al. |
2404.04159v1 |
null |
2024-04-05 |
Improving Detection in Aerial Images by Capturing Inter-Object Relationships |
Botao Ren et.al. |
2404.04140v1 |
null |
2024-04-05 |
Label Propagation for Zero-shot Classification with Vision-Language Models |
Vladan Stojnić et.al. |
2404.04072v1 |
link |
2024-04-05 |
VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots |
Akhil Padmanabha et.al. |
2404.04066v1 |
null |
2024-04-05 |
Phase Binarization in Mutually Synchronized Bias Field-free Spin Hall Nano-oscillators for Reservoir Computing |
Sourabh Manna et.al. |
2404.04023v1 |
null |
2024-04-04 |
OW-VISCap: Open-World Video Instance Segmentation and Captioning |
Anwesa Choudhuri et.al. |
2404.03657v1 |
null |
2024-04-04 |
Decoupling Static and Hierarchical Motion Perception for Referring Video Segmentation |
Shuting He et.al. |
2404.03645v1 |
link |
2024-04-04 |
On the Efficiency of Convolutional Neural Networks |
Andrew Lavin et.al. |
2404.03617v1 |
null |
2024-04-04 |
Creator Hearts: Investigating the Impact Positive Signals from YouTube Creators in Shaping Comment Section Behavior |
Frederick Choi et.al. |
2404.03612v1 |
null |
2024-04-04 |
InsectMamba: Insect Pest Classification with State Space Model |
Qianning Wang et.al. |
2404.03611v1 |
null |
2024-04-04 |
DiffDet4SAR: Diffusion-based Aircraft Target Detection Network for SAR Images |
Zhou Jie et.al. |
2404.03595v1 |
link |
2024-04-04 |
Alzheimer's disease detection in PSG signals |
Lorena Gallego-Viñarás et.al. |
2404.03549v1 |
null |
2024-04-04 |
Towards Transcranial 3D Ultrasound Localization Microscopy of the Nonhuman Primate Brain |
Paul Xing et.al. |
2404.03547v1 |
null |
2024-04-04 |
Segmentation-Guided Knee Radiograph Generation using Conditional Diffusion Models |
Siyuan Mei et.al. |
2404.03541v1 |
null |
2024-04-05 |
A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data |
Iqra Bano et.al. |
2404.03493v2 |
null |
2024-04-03 |
LidarDM: Generative LiDAR Simulation in a Generated World |
Vlas Zyrianov et.al. |
2404.02903v1 |
null |
2024-04-03 |
Guarantees of confidentiality via Hammersley-Chapman-Robbins bounds |
Kamalika Chaudhuri et.al. |
2404.02866v1 |
link |
2024-04-03 |
Semisimple Algebras of Vector Fields on $\mathbb{C}^{3}$ |
Sajid Ali et.al. |
2404.02847v1 |
null |
2024-04-03 |
GPU-Accelerated RSF Level Set Evolution for Large-Scale Microvascular Segmentation |
Meher Niger et.al. |
2404.02813v1 |
null |
2024-04-03 |
Generative-Contrastive Heterogeneous Graph Neural Network |
Yu Wang et.al. |
2404.02810v1 |
null |
2024-04-03 |
FPT: Feature Prompt Tuning for Few-shot Readability Assessment |
Ziyang Wang et.al. |
2404.02772v1 |
link |
2024-04-03 |
DIBS: Enhancing Dense Video Captioning with Unlabeled Videos via Pseudo Boundary Enrichment and Online Refinement |
Hao Wu et.al. |
2404.02755v1 |
null |
2024-04-03 |
Terraced Compression Method with Automated Threshold Selection for Multidimensional Image Clustering of Heterogeneous Bodies |
Jiatong Li et.al. |
2404.02744v1 |
null |
2024-04-03 |
Event Camera Demosaicing via Swin Transformer and Pixel-focus Loss |
Yunfan Lu et.al. |
2404.02731v1 |
link |
2024-04-03 |
Unblind Text Inputs: Predicting Hint-text of Text Input in Mobile Apps via LLM |
Zhe Liu et.al. |
2404.02706v1 |
null |
2024-04-02 |
Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Orthogonal Diffusion Models |
Zeyu Yang et.al. |
2404.02148v1 |
link |
2024-04-02 |
Multiparametric quantification and visualization of liver fat using ultrasound |
Jihye Baek et.al. |
2404.02143v1 |
null |
2024-04-03 |
ResNet with Integrated Convolutional Block Attention Module for Ship Classification Using Transfer Learning on Optical Satellite Imagery |
Ryan Donghan Kwon et.al. |
2404.02135v2 |
null |
2024-04-02 |
ViTamin: Designing Scalable Vision Models in the Vision-Language Era |
Jieneng Chen et.al. |
2404.02132v1 |
link |
2024-04-02 |
ImageNot: A contrast with ImageNet preserves model rankings |
Olawale Salaudeen et.al. |
2404.02112v1 |
null |
2024-04-02 |
CameraCtrl: Enabling Camera Control for Text-to-Video Generation |
Hao He et.al. |
2404.02101v1 |
link |
2024-04-02 |
Explainability in JupyterLab and Beyond: Interactive XAI Systems for Integrated and Collaborative Workflows |
Grace Guo et.al. |
2404.02081v1 |
null |
2024-04-02 |
Multi-Level Label Correction by Distilling Proximate Patterns for Semi-supervised Semantic Segmentation |
Hui Xiao et.al. |
2404.02065v1 |
null |
2024-04-02 |
Long-context LLMs Struggle with Long In-context Learning |
Tianle Li et.al. |
2404.02060v1 |
link |
2024-04-02 |
Deconstructing In-Context Learning: Understanding Prompts via Corruption |
Namrata Shivagunde et.al. |
2404.02054v1 |
link |
2024-03-29 |
Learn "No" to Say "Yes" Better: Improving Vision-Language Models via Negations |
Jaisidh Singh et.al. |
2403.20312v1 |
link |
2024-03-29 |
Emotion-Anchored Contrastive Learning Framework for Emotion Recognition in Conversation |
Fangxu Yu et.al. |
2403.20289v1 |
link |
2024-03-29 |
Prototype-based Interpretable Breast Cancer Prediction Models: Analysis and Challenges |
Shreyasi Pathak et.al. |
2403.20260v1 |
null |
2024-03-29 |
Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions |
Runhao Zeng et.al. |
2403.20254v1 |
null |
2024-03-29 |
Latent Embedding Clustering for Occlusion Robust Head Pose Estimation |
José Celestino et.al. |
2403.20251v1 |
null |
2024-03-29 |
Long-Tailed Anomaly Detection with Learnable Class Names |
Chih-Hui Ho et.al. |
2403.20236v1 |
null |
2024-04-02 |
Artificial Neural Networks-based Real-time Classification of ENG Signals for Implanted Nerve Interfaces |
Antonio Coviello et.al. |
2403.20234v2 |
null |
2024-03-29 |
MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark |
Sanghyun Woo et.al. |
2403.20225v1 |
null |
2024-03-29 |
Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science |
Yazheng Yang et.al. |
2403.20208v1 |
null |
2024-03-29 |
The Future of Combating Rumors? Retrieval, Discrimination, and Generation |
Junhao Xu et.al. |
2403.20204v1 |
null |
2024-03-28 |
RSMamba: Remote Sensing Image Classification with State Space Model |
Keyan Chen et.al. |
2403.19654v1 |
link |
2024-03-28 |
Square patterns in dynamical orbits |
Vefa Goksel et.al. |
2403.19642v1 |
null |
2024-03-28 |
Siamese Vision Transformers are Scalable Audio-visual Learners |
Yan-Bo Lin et.al. |
2403.19638v1 |
null |
2024-03-28 |
Four-dimensional gradient Ricci solitons with (half) nonnegative isotropic curvature |
Huai-Dong Cao et.al. |
2403.19627v1 |
null |
2024-03-28 |
Top-$k$ Classification and Cardinality-Aware Prediction |
Anqi Mao et.al. |
2403.19625v1 |
null |
2024-03-28 |
RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents |
Zeren Chen et.al. |
2403.19622v1 |
null |
2024-03-28 |
SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects |
Avinash Ummadisingu et.al. |
2403.19607v1 |
null |
2024-03-28 |
Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model |
Zhicai Wang et.al. |
2403.19600v1 |
link |
2024-03-28 |
Frame by Familiar Frame: Understanding Replication in Video Diffusion Models |
Aimon Rahman et.al. |
2403.19593v1 |
null |
2024-03-28 |
Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation |
Zhongliang Zhou et.al. |
2403.19584v1 |
null |
2024-03-27 |
MetaCap: Meta-learning Priors from Multi-View Imagery for Sparse-view Human Performance Capture and Rendering |
Guoxing Sun et.al. |
2403.18820v1 |
null |
2024-03-27 |
Breaking the Limitations with Sparse Inputs by Variational Frameworks (BLIss) in Terahertz Super-Resolution 3D Reconstruction |
Yiyao Zhang et.al. |
2403.18776v1 |
null |
2024-03-27 |
CaT: Constraints as Terminations for Legged Locomotion Reinforcement Learning |
Elliot Chane-Sane et.al. |
2403.18765v1 |
null |
2024-03-27 |
A vascular synthetic model for improved aneurysm segmentation and detection via Deep Neural Networks |
Rafic Nader et.al. |
2403.18734v1 |
null |
2024-03-27 |
Contrastive Learning with Orthonormal Anchors (CLOA) |
Huanran Li et.al. |
2403.18699v1 |
null |
2024-03-27 |
Annolid: Annotate, Segment, and Track Anything You Need |
Chen Yang et.al. |
2403.18690v1 |
null |
2024-03-27 |
InceptionTime vs. Wavelet -- A comparison for time series classification |
Daniel Klenkert et.al. |
2403.18687v1 |
null |
2024-03-27 |
TransFusion: Contrastive Learning with Transformers |
Huanran Li et.al. |
2403.18681v1 |
null |
2024-03-28 |
FluxGAT: Integrating Flux Sampling with Graph Neural Networks for Unbiased Gene Essentiality Classification |
Kieren Sharma et.al. |
2403.18666v2 |
null |
2024-03-27 |
Indecomposable set-theoretical solutions to the Yang-Baxter equation of size $p^2$ |
Carsten Dietzel et.al. |
2403.18653v1 |
null |
2024-03-26 |
Efficient Video Object Segmentation via Modulated Cross-Attention Memory |
Abdelrahman Shaker et.al. |
2403.17937v1 |
link |
2024-03-26 |
ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis |
Muhammad Hamza Mughal et.al. |
2403.17936v1 |
null |
2024-03-26 |
OmniVid: A Generative Framework for Universal Video Understanding |
Junke Wang et.al. |
2403.17935v1 |
link |
2024-03-26 |
Track Everything Everywhere Fast and Robustly |
Yunzhou Song et.al. |
2403.17931v1 |
null |
2024-03-26 |
FastCAR: Fast Classification And Regression Multi-Task Learning via Task Consolidation for Modelling a Continuous Property Variable of Object Classes |
Anoop Kini et.al. |
2403.17926v1 |
null |
2024-03-26 |
The Need for Speed: Pruning Transformers with One Recipe |
Samir Khaki et.al. |
2403.17921v1 |
link |
2024-03-26 |
TC4D: Trajectory-Conditioned Text-to-4D Generation |
Sherwin Bahmani et.al. |
2403.17920v1 |
null |
2024-03-26 |
AgentStudio: A Toolkit for Building General Virtual Agents |
Longtao Zheng et.al. |
2403.17918v1 |
null |
2024-03-26 |
Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos |
Akshay Paruchuri et.al. |
2403.17915v1 |
null |
2024-03-26 |
Hierarchical Multi-label Classification for Fine-level Event Extraction from Aviation Accident Reports |
Xinyu Zhao et.al. |
2403.17914v1 |
null |
2024-03-25 |
DBPF: A Framework for Efficient and Robust Dynamic Bin-Picking |
Yichuan Li et.al. |
2403.16786v1 |
null |
2024-03-25 |
C-arm inverse geometry CT for 3D cardiac chamber mapping |
Jordan M. Slagowski et.al. |
2403.16779v1 |
null |
2024-03-25 |
Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases |
Sophie Starck et.al. |
2403.16776v1 |
null |
2024-03-25 |
As Good As A Coin Toss: Human detection of AI-generated images, videos, audio, and audiovisual stimuli |
Di Cooke et.al. |
2403.16760v1 |
null |
2024-03-25 |
Creating a Digital Twin of Spinal Surgery: A Proof of Concept |
Jonas Hein et.al. |
2403.16736v1 |
null |
2024-03-25 |
A Robotic Skill Learning System Built Upon Diffusion Policies and Foundation Models |
Nils Ingelhag et.al. |
2403.16730v1 |
null |
2024-03-25 |
One-Shot Domain Incremental Learning |
Yasushi Esaki et.al. |
2403.16707v1 |
null |
2024-03-25 |
Assessing the Performance of Deep Learning for Automated Gleason Grading in Prostate Cancer |
Dominik Müller et.al. |
2403.16695v1 |
null |
2024-03-25 |
DeepGleason: a System for Automated Gleason Grading of Prostate Cancer using Deep Neural Networks |
Dominik Müller et.al. |
2403.16678v1 |
link |
2024-03-25 |
FOOL: Addressing the Downlink Bottleneck in Satellite Computing with Neural Feature Compression |
Alireza Furutanpey et.al. |
2403.16677v1 |
null |
2024-03-25 |
A Novel Loss Function-based Support Vector Machine for Binary Classification |
Yan Li et.al. |
2403.16654v1 |
null |
2024-03-25 |
Self-Adaptive Reality-Guided Diffusion for Artifact-Free Super-Resolution |
Qingping Zheng et.al. |
2403.16643v1 |
null |
2024-03-25 |
Multi-Scale Texture Loss for CT denoising with GANs |
Francesco Di Feola et.al. |
2403.16640v1 |
link |
2024-03-25 |
AI-Generated Video Detection via Spatio-Temporal Anomaly Learning |
Jianfa Bai et.al. |
2403.16638v1 |
null |
2024-03-25 |
Distributed collaborative anomalous sound detection by embedding sharing |
Kota Dohi et.al. |
2403.16610v1 |
null |
2024-03-25 |
EDUE: Expert Disagreement-Guided One-Pass Uncertainty Estimation for Medical Image Segmentation |
Kudaibergen Abutalip et.al. |
2403.16594v1 |
null |
2024-03-22 |
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models |
Yuzhang Shang et.al. |
2403.15388v1 |
null |
2024-03-22 |
Time-efficient, high-resolution 3T whole-brain relaxometry using Cartesian 3D MR-STAT with CSF suppression |
Hongyan Liu et.al. |
2403.15379v1 |
null |
2024-03-22 |
Long-CLIP: Unlocking the Long-Text Capability of CLIP |
Beichen Zhang et.al. |
2403.15378v1 |
null |
2024-03-22 |
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding |
Yi Wang et.al. |
2403.15377v1 |
null |
2024-03-22 |
Cascading Blackout Severity Prediction with Statistically-Augmented Graph Neural Networks |
Joe Gorka et.al. |
2403.15363v1 |
null |
2024-03-22 |
SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series |
Badri N. Patro et.al. |
2403.15360v1 |
null |
2024-03-22 |
Ultrasound Imaging based on the Variance of a Diffusion Restoration Model |
Yuxin Zhang et.al. |
2403.15316v1 |
null |
2024-03-22 |
Global Control for Local SO(3)-Equivariant Scale-Invariant Vessel Segmentation |
Patryk Rygiel et.al. |
2403.15314v1 |
null |
2024-03-22 |
Quantum-inspired classification via efficient simulation of Helstrom measurement |
Wooseop Hwang et.al. |
2403.15308v1 |
null |
2024-03-22 |
Reconnaissance ultracool spectra in the Euclid Deep Fields |
Jerry Jun-Yan Zhang et.al. |
2403.15288v1 |
null |
2024-03-21 |
Language Repository for Long Video Understanding |
Kumara Kahatapitiya et.al. |
2403.14622v1 |
link |
2024-03-22 |
Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion |
Xiang Fan et.al. |
2403.14617v2 |
null |
2024-03-21 |
Explorative Inbetweening of Time and Space |
Haiwen Feng et.al. |
2403.14611v1 |
null |
2024-03-21 |
ReNoise: Real Image Inversion Through Iterative Noising |
Daniel Garibi et.al. |
2403.14602v1 |
null |
2024-03-21 |
PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model |
Zheng Zhang et.al. |
2403.14598v1 |
link |
2024-03-21 |
Large Language Models for Multi-Choice Question Classification of Medical Subjects |
Víctor Ponce-López et.al. |
2403.14582v1 |
null |
2024-03-21 |
DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video |
Narek Tumanyan et.al. |
2403.14548v1 |
null |
2024-03-21 |
Estimating Physical Information Consistency of Channel Data Augmentation for Remote Sensing Images |
Tom Burgert et.al. |
2403.14547v1 |
null |
2024-03-21 |
Transfer Learning for Cross-dataset Isolated Sign Language Recognition in Under-Resourced Datasets |
Ahmet Alp Kindiroglu et.al. |
2403.14534v1 |
link |
2024-03-21 |
Invisible Needle Detection in Ultrasound: Leveraging Mechanism-Induced Vibration |
Chenyang Li et.al. |
2403.14523v1 |
null |
2024-03-21 |
Denoising Diffusion Models for 3D Healthy Brain Tissue Inpainting |
Alicia Durrer et.al. |
2403.14499v1 |
link |
2024-03-20 |
TimeRewind: Rewinding Time with Image-and-Events Video Diffusion |
Jingxi Chen et.al. |
2403.13800v1 |
null |
2024-03-20 |
Hierarchical NeuroSymbolic Approach for Action Quality Assessment |
Lauren Okamoto et.al. |
2403.13798v1 |
null |
2024-03-20 |
Bridge the Modality and Capacity Gaps in Vision-Language Model Selection |
Chao Yi et.al. |
2403.13797v1 |
null |
2024-03-20 |
The Model Openness Framework: Promoting Completeness and Openness for Reproducibility, Transparency and Usability in AI |
Matt White et.al. |
2403.13784v1 |
null |
2024-03-20 |
Gradings on associative triple systems of the second kind |
Alberto Daza-Garcia et.al. |
2403.13775v1 |
null |
2024-03-20 |
Towards Principled Representation Learning from Videos for Reinforcement Learning |
Dipendra Misra et.al. |
2403.13765v1 |
null |
2024-03-20 |
Enhancing Gait Video Analysis in Neurodegenerative Diseases by Knowledge Augmentation in Vision Language Model |
Diwei Wang et.al. |
2403.13756v1 |
null |
2024-03-20 |
Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation |
Fu-Yun Wang et.al. |
2403.13745v1 |
null |
2024-03-20 |
Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes |
Yifan Chen et.al. |
2403.13724v1 |
null |
2024-03-20 |
Improving the Adaptive Moment Estimation (ADAM) stochastic optimizer through an Implicit-Explicit (IMEX) time-stepping approach |
Abhinab Bhattacharjee et.al. |
2403.13704v1 |
null |
2024-03-19 |
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression |
Zhuoshi Pan et.al. |
2403.12968v1 |
null |
2024-03-19 |
FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation |
Shuai Yang et.al. |
2403.12962v1 |
link |
2024-03-19 |
WHAC: World-grounded Humans and Cameras |
Wanqi Yin et.al. |
2403.12959v1 |
null |
2024-03-19 |
FutureDepth: Learning to Predict the Future Improves Video Depth Estimation |
Rajeev Yasarla et.al. |
2403.12953v1 |
null |
2024-03-19 |
Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models |
Elaine Sui et.al. |
2403.12952v1 |
link |
2024-03-19 |
Legendrian loops and cluster modular groups |
James Hughes et.al. |
2403.12951v1 |
null |
2024-03-19 |
Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers |
Vidhi Jain et.al. |
2403.12943v1 |
null |
2024-03-19 |
Contextual AD Narration with Interleaved Multimodal Sequence |
Hanlin Wang et.al. |
2403.12922v1 |
null |
2024-03-19 |
Semantic Layering in Room Segmentation via LLMs |
Taehyeon Kim et.al. |
2403.12920v1 |
null |
2024-03-19 |
Yell At Your Robot: Improving On-the-Fly from Language Corrections |
Lucy Xiaoyang Shi et.al. |
2403.12910v1 |
null |
2024-03-18 |
Time Series Compression using Quaternion Valued Neural Networks and Quaternion Backpropagation |
Johannes Pöppelbaum et.al. |
2403.11722v1 |
null |
2024-03-18 |
Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing |
Juan Zhang et.al. |
2403.11700v1 |
null |
2024-03-18 |
A Spatial-Temporal Progressive Fusion Network for Breast Lesion Segmentation in Ultrasound Videos |
Zhengzheng Tu et.al. |
2403.11699v1 |
null |
2024-03-18 |
Object Segmentation-Assisted Inter Prediction for Versatile Video Coding |
Zhuoyuan Li et.al. |
2403.11694v1 |
null |
2024-03-19 |
MoreStyle: Relax Low-frequency Constraint of Fourier-based Image Reconstruction in Generalizable Medical Image Segmentation |
Haoyu Zhao et.al. |
2403.11689v2 |
null |
2024-03-18 |
Better (pseudo-)labels for semi-supervised instance segmentation |
François Porcher et.al. |
2403.11675v1 |
null |
2024-03-19 |
WIA-LD2ND: Wavelet-based Image Alignment for Self-supervised Low-Dose CT Denoising |
Haoyu Zhao et.al. |
2403.11672v2 |
null |
2024-03-18 |
Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection |
Julia Wolleb et.al. |
2403.11667v1 |
null |
2024-03-18 |
Combining Local and Global Perception for Autonomous Navigation on Nano-UAVs |
Lorenzo Lamberti et.al. |
2403.11661v1 |
null |
2024-03-18 |
LocalStyleFool: Regional Video Style Transfer Attack Using Segment Anything Model |
Yuxin Cao et.al. |
2403.11656v1 |
null |
2024-03-15 |
Strong and Controllable Blind Image Decomposition |
Zeyu Zhang et.al. |
2403.10520v1 |
link |
2024-03-15 |
Frozen Feature Augmentation for Few-Shot Image Classification |
Andreas Bär et.al. |
2403.10519v1 |
null |
2024-03-15 |
VideoAgent: Long-form Video Understanding with Large Language Model as Agent |
Xiaohan Wang et.al. |
2403.10517v1 |
null |
2024-03-15 |
Surveyor: Facilitating Discovery Within Video Games for Blind and Low Vision Players |
Vishnu Nair et.al. |
2403.10512v1 |
null |
2024-03-15 |
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study |
Chenguang Wang et.al. |
2403.10499v1 |
link |
2024-03-15 |
Joint Multimodal Transformer for Dimensional Emotional Recognition in the Wild |
Paul Waligora et.al. |
2403.10488v1 |
null |
2024-03-15 |
Tensor Star Decomposition |
Wuyang Zhou et.al. |
2403.10481v1 |
null |
2024-03-15 |
Using an LLM to Turn Sign Spottings into Spoken Language Sentences |
Ozge Mercanoglu Sincan et.al. |
2403.10434v1 |
null |
2024-03-15 |
Neural Networks Hear You Loud And Clear: Hearing Loss Compensation Using Deep Neural Networks |
Peter Leer et.al. |
2403.10420v1 |
null |
2024-03-15 |
A comparative study on machine learning approaches for rock mass classification using drilling data |
Tom F. Hansen et.al. |
2403.10404v1 |
null |
2024-03-14 |
Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models |
Akhil Kedia et.al. |
2403.09635v1 |
link |
2024-03-14 |
Generalized Predictive Model for Autonomous Driving |
Jiazhi Yang et.al. |
2403.09630v1 |
link |
2024-03-14 |
From the Conformal Anomaly to the Virasoro Algebra |
Sid Maibach et.al. |
2403.09628v1 |
null |
2024-03-14 |
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding |
Guo Chen et.al. |
2403.09626v1 |
link |
2024-03-14 |
Score-Guided Diffusion for 3D Human Recovery |
Anastasis Stathopoulos et.al. |
2403.09623v1 |
link |
2024-03-14 |
PosSAM: Panoptic Open-vocabulary Segment Anything |
Vibashan VS et.al. |
2403.09620v1 |
null |
2024-03-14 |
Explore In-Context Segmentation via Latent Diffusion Models |
Chaoyang Wang et.al. |
2403.09616v1 |
null |
2024-03-14 |
Compute-first optical detection for noise-resilient visual perception |
Jungmin Kim et.al. |
2403.09612v1 |
null |
2024-03-14 |
Mixture of Mixups for Multi-label Classification of Rare Anuran Sounds |
Ilyass Moummad et.al. |
2403.09598v1 |
link |
2024-03-14 |
DungeonMaker: Embedding Tangible Creation and Destruction in Hybrid Board Games through Personal Fabrication Technology |
Evgeny Stemasov et.al. |
2403.09592v1 |
null |
2024-03-13 |
VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis |
Enric Corona et.al. |
2403.08764v1 |
null |
2024-03-13 |
Segmentation of Knee Bones for Osteoarthritis Assessment: A Comparative Analysis of Supervised, Few-Shot, and Zero-Shot Learning Approaches |
Yun Xin Teoh et.al. |
2403.08761v1 |
null |
2024-03-13 |
MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning |
Jialv Zou et.al. |
2403.08760v1 |
link |
2024-03-13 |
Spatiotemporal Diffusion Model with Paired Sampling for Accelerated Cardiac Cine MRI |
Shihan Qiu et.al. |
2403.08758v1 |
null |
2024-03-13 |
DAM: Dynamic Adapter Merging for Continual Video QA Learning |
Feng Cheng et.al. |
2403.08755v1 |
link |
2024-03-13 |
Clinically Feasible Diffusion Reconstruction for Highly-Accelerated Cardiac Cine MRI |
Shihan Qiu et.al. |
2403.08749v1 |
null |
2024-03-13 |
Torsion pairs, t-structures, and co-t-structures for completions of discrete cluster categories |
Sofia Franchini et.al. |
2403.08735v1 |
null |
2024-03-13 |
Euclid: Testing photometric selection of emission-line galaxy targets |
M. S. Cagliari et.al. |
2403.08726v1 |
null |
2024-03-13 |
Diffusion-based Iterative Counterfactual Explanations for Fetal Ultrasound Image Quality Assessment |
Paraskevas Pegios et.al. |
2403.08700v1 |
null |
2024-03-13 |
Implicit Regularization of Gradient Flow on One-Layer Softmax Attention |
Heejune Sheen et.al. |
2403.08699v1 |
null |
2024-03-12 |
OPEN TEACH: A Versatile Teleoperation System for Robotic Manipulation |
Aadhithya Iyer et.al. |
2403.07870v1 |
null |
2024-03-12 |
TeleMoMa: A Modular and Versatile Teleoperation System for Mobile Manipulation |
Shivin Dass et.al. |
2403.07869v1 |
null |
2024-03-12 |
Iterative Graph Neural Network Enhancement via Frequent Subgraph Mining of Explanations |
Harish G. Naik et.al. |
2403.07849v1 |
null |
2024-03-12 |
When Eye-Tracking Meets Machine Learning: A Systematic Review on Applications in Medical Image Analysis |
Sahar Moradizeyveh et.al. |
2403.07834v1 |
null |
2024-03-12 |
DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies |
William Xie et.al. |
2403.07832v1 |
null |
2024-03-12 |
A geometric model for the module category of a string algebra |
Karin Baur et.al. |
2403.07810v1 |
null |
2024-03-12 |
BraSyn 2023 challenge: Missing MRI synthesis and the effect of different learning objectives |
Ivo M. Baltruschat et.al. |
2403.07800v1 |
null |
2024-03-12 |
A robust SVM-based approach with feature selection and outliers detection for classification problems |
Marta Baldomero-Naranjo et.al. |
2403.07753v1 |
null |
2024-03-12 |
Vision-based Vehicle Re-identification in Bridge Scenario using Flock Similarity |
Chunfeng Zhang et.al. |
2403.07752v1 |
null |
2024-03-12 |
Harnessing two-photon dissipation for enhanced quantum measurement and control |
Antoine Marquet et.al. |
2403.07744v1 |
null |
2024-03-11 |
Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Spatiotemporal Modeling |
Wele Gedara Chaminda Bandara et.al. |
2403.06978v1 |
link |
2024-03-12 |
VideoMamba: State Space Model for Efficient Video Understanding |
Kunchang Li et.al. |
2403.06977v2 |
link |
2024-03-11 |
Memory-based Adapters for Online 3D Scene Perception |
Xiuwei Xu et.al. |
2403.06974v1 |
null |
2024-03-11 |
Explainable Transformer Prototypes for Medical Diagnoses |
Ugur Demir et.al. |
2403.06961v1 |
link |
2024-03-11 |
Quadruped-Frog: Rapid Online Optimization of Continuous Quadruped Jumping |
Guillaume Bellegarda et.al. |
2403.06954v1 |
null |
2024-03-11 |
Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer |
Siddhant Satyanaik et.al. |
2403.06953v1 |
null |
2024-03-11 |
Advancing Generalizable Remote Physiological Measurement through the Integration of Explicit and Implicit Prior Knowledge |
Yuting Zhang et.al. |
2403.06947v1 |
link |
2024-03-11 |
Conditional Score-Based Diffusion Model for Cortical Thickness Trajectory Prediction |
Qing Xiao et.al. |
2403.06940v1 |
null |
2024-03-11 |
FocusCLIP: Multimodal Subject-Level Guidance for Zero-Shot Transfer in Human-Centric Tasks |
Muhammad Saif Ullah Khan et.al. |
2403.06904v1 |
null |
2024-03-11 |
Benign overfitting in leaky ReLU networks with moderate input dimension |
Kedar Karhadkar et.al. |
2403.06903v1 |
null |
2024-03-08 |
Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos |
Tarun Kalluri et.al. |
2403.05535v1 |
null |
2024-03-08 |
Tune without Validation: Searching for Learning Rate and Weight Decay on Training Sets |
Lorenzo Brigato et.al. |
2403.05532v1 |
null |
2024-03-08 |
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context |
Machel Reid et.al. |
2403.05530v1 |
null |
2024-03-08 |
Take Your Best Shot: Sampling-Based Next-Best-View Planning for Autonomous Photography & Inspection |
Shijie Gao et.al. |
2403.05477v1 |
null |
2024-03-08 |
Will GPT-4 Run DOOM? |
Adrian de Wynter et.al. |
2403.05468v1 |
null |
2024-03-08 |
Evaluating AI and Human Authorship Quality in Academic Writing through Physics Essays |
Will Yeadon et.al. |
2403.05458v1 |
null |
2024-03-08 |
VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models |
Yabo Zhang et.al. |
2403.05438v1 |
link |
2024-03-08 |
OmniCount: Multi-label Object Counting with Semantic-Geometric Priors |
Anindya Mondal et.al. |
2403.05435v1 |
null |
2024-03-08 |
Infinite Translation Surfaces in the Wild |
Vincent Delecroix et.al. |
2403.05424v1 |
null |
2024-03-08 |
Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery |
Mubashir Noman et.al. |
2403.05419v1 |
link |
2024-03-07 |
DeepSee: Multidimensional Visualizations of Seabed Ecosystems |
Adam Coscia et.al. |
2403.04761v1 |
link |
2024-03-07 |
iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries |
Adam Coscia et.al. |
2403.04760v1 |
link |
2024-03-07 |
KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts |
Adam Coscia et.al. |
2403.04758v1 |
link |
2024-03-07 |
Preliminary Guidelines For Combining Data Integration and Visual Data Analysis |
Adam Coscia et.al. |
2403.04757v1 |
link |
2024-03-07 |
Photonic probabilistic machine learning using quantum vacuum noise |
Seou Choi et.al. |
2403.04731v1 |
null |
2024-03-07 |
Analysis of Systems' Performance in Natural Language Processing Competitions |
Sergio Nava-Muñoz et.al. |
2403.04693v1 |
null |
2024-03-07 |
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios |
Qilang Ye et.al. |
2403.04640v1 |
link |
2024-03-07 |
Scalable, Simulation-Guided Compliant Tactile Finger Design |
Yuxiang Ma et.al. |
2403.04638v1 |
null |
2024-03-08 |
Pix2Gif: Motion-Guided Diffusion for GIF Generation |
Hitesh Kandala et.al. |
2403.04634v2 |
null |
2024-03-07 |
MedFLIP: Medical Vision-and-Language Self-supervised Fast Pre-Training with Masked Autoencoder |
Lei Li et.al. |
2403.04626v1 |
null |
2024-03-06 |
3D Diffusion Policy |
Yanjie Ze et.al. |
2403.03954v1 |
link |
2024-03-06 |
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL |
Jesse Farebrother et.al. |
2403.03950v1 |
null |
2024-03-06 |
Reconciling Reality through Simulation: A Real-to-Sim-to-Real Approach for Robust Manipulation |
Marcel Torne et.al. |
2403.03949v1 |
null |
2024-03-06 |
DART: Implicit Doppler Tomography for Radar Novel View Synthesis |
Tianshu Huang et.al. |
2403.03896v1 |
null |
2024-03-06 |
Joint multi-task learning improves weakly-supervised biomarker prediction in computational pathology |
Omar S. M. El Nahhas et.al. |
2403.03891v1 |
link |
2024-03-06 |
Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation |
Xiao Ma et.al. |
2403.03890v1 |
null |
2024-03-06 |
Decoupled Vertical Federated Learning for Practical Training on Vertically Partitioned Data |
Avi Amalanshu et.al. |
2403.03871v1 |
null |
2024-03-06 |
X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification |
Hanzi Xu et.al. |
2403.03863v1 |
link |
2024-03-06 |
ProxNF: Neural Field Proximal Training for High-Resolution 4D Dynamic Image Reconstruction |
Luke Lozenski et.al. |
2403.03860v1 |
null |
2024-03-06 |
MedMamba: Vision Mamba for Medical Image Classification |
Yubiao Yue et.al. |
2403.03849v1 |
link |
2024-03-05 |
Extension Theory and Fermionic Strongly Fusion 2-Categories |
Thibault D. Décoppet et.al. |
2403.03211v1 |
null |
2024-03-05 |
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis |
Patrick Esser et.al. |
2403.03206v1 |
null |
2024-03-05 |
Behavior Generation with Latent Actions |
Seungjae Lee et.al. |
2403.03181v1 |
link |
2024-03-05 |
Deep-Learned Compression for Radio-Frequency Signal Classification |
Armani Rodriguez et.al. |
2403.03150v1 |
null |
2024-03-05 |
Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization |
Yuxin Guo et.al. |
2403.03145v1 |
link |
2024-03-05 |
Motion-Corrected Moving Average: Including Post-Hoc Temporal Information for Improved Video Segmentation |
Robert Mendel et.al. |
2403.03120v1 |
null |
2024-03-05 |
Equilibria in Two-Stage Facility Location with Atomic Clients |
Simon Krogmann et.al. |
2403.03114v1 |
null |
2024-03-05 |
Galaxies in the Zone of Avoidance: Misclassifications using machine learning tools |
P. Marchant Cortés et.al. |
2403.03098v1 |
null |
2024-03-05 |
Collective self-caging of active filaments in virtual confinement |
Maximilian Kurjahn et.al. |
2403.03093v1 |
null |
2024-03-05 |
A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives |
Simone Alberto Peirone et.al. |
2403.03037v1 |
null |
2024-03-03 |
Enhancing Retinal Vascular Structure Segmentation in Images With a Novel Design Two-Path Interactive Fusion Module Model |
Rui Yang et.al. |
2403.01362v1 |
null |
2024-03-02 |
Improve Cost Efficiency of Active Learning over Noisy Dataset |
Zan-Kai Chong et.al. |
2403.01346v1 |
null |
2024-03-02 |
An eternal hypersurface flow arising in centro-affine geometry |
Xinjie Jiang et.al. |
2403.01340v1 |
null |
2024-03-02 |
Image-Based Dietary Assessment: A Healthy Eating Plate Estimation System |
Assylzhan Izbassar et.al. |
2403.01310v1 |
null |
2024-03-02 |
VNLP: Turkish NLP Package |
Meliksah Turker et.al. |
2403.01309v1 |
null |
2024-03-02 |
Towards a classification of $p^2$-discriminant ideal twins over number fields |
Alyson Deines et.al. |
2403.01287v1 |
null |
2024-03-02 |
$π$-systems and the Embedding problem for rank $2$ Kac-Moody Lie algebras |
Irfan Habib et.al. |
2403.01285v1 |
null |
2024-03-02 |
Fast Low-parameter Video Activity Localization in Collaborative Learning Environments |
Venkatesh Jatla et.al. |
2403.01281v1 |
null |
2024-03-02 |
Rigidity results for group von Neumann algebras with diffuse center |
Ionuţ Chifan et.al. |
2403.01280v1 |
null |
2024-03-02 |
Can a Confident Prior Replace a Cold Posterior? |
Martin Marek et.al. |
2403.01272v1 |
link |
2024-02-29 |
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers |
Tsai-Shien Chen et.al. |
2402.19479v1 |
null |
2024-02-29 |
Towards Generalizable Tumor Synthesis |
Qi Chen et.al. |
2402.19470v1 |
null |
2024-02-29 |
Humanoid Locomotion as Next Token Prediction |
Ilija Radosavovic et.al. |
2402.19469v1 |
null |
2024-03-01 |
TV-TREES: Multimodal Entailment Trees for Neuro-Symbolic Video Reasoning |
Kate Sanders et.al. |
2402.19467v2 |
null |
2024-02-29 |
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models |
Frederik Kunstner et.al. |
2402.19449v1 |
null |
2024-02-29 |
Probing the Information Encoded in Neural-based Acoustic Models of Automatic Speech Recognition Systems |
Quentin Raymondaud et.al. |
2402.19443v1 |
null |
2024-02-29 |
Pushing the Limits of Cross-Embodiment Learning for Manipulation and Navigation |
Jonathan Yang et.al. |
2402.19432v1 |
null |
2024-02-29 |
PaECTER: Patent-level Representation Learning using Citation-informed Transformers |
Mainak Ghosh et.al. |
2402.19411v1 |
null |
2024-02-29 |
Navigating Hallucinations for Reasoning of Unintentional Activities |
Shresth Grover et.al. |
2402.19405v1 |
null |
2024-02-29 |
A Newborn AGN in a Starforming Galaxy |
P. Arévalo et.al. |
2402.19403v1 |
null |
2024-02-28 |
Time-efficient filtering of polarimetric data by checking physical realizability of experimental Mueller matrices |
Tatiana Novikova et.al. |
2402.18555v1 |
null |
2024-02-28 |
Selection of appropriate multispectral camera exposure settings and radiometric calibration methods for applications in phenotyping and precision agriculture |
Vaishali Swaminathan et.al. |
2402.18553v1 |
null |
2024-02-28 |
Implicit Bias of Next-Token Prediction |
Christos Thrampoulidis et.al. |
2402.18551v1 |
null |
2024-02-28 |
Defect Detection in Tire X-Ray Images: Conventional Methods Meet Deep Structures |
Andrei Cozma et.al. |
2402.18527v1 |
null |
2024-02-28 |
Do galaxy mergers prefer under-dense environments? |
U. Sureshkumar et.al. |
2402.18520v1 |
null |
2024-02-28 |
Log Neural Controlled Differential Equations: The Lie Brackets Make a Difference |
Benjamin Walker et.al. |
2402.18512v1 |
null |
2024-02-28 |
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling |
Mahdi Karami et.al. |
2402.18508v1 |
null |
2024-02-28 |
Detection of Micromobility Vehicles in Urban Traffic Videos |
Khalil Sabri et.al. |
2402.18503v1 |
link |
2024-02-28 |
Few-Shot Fairness: Unveiling LLM's Potential for Fairness-Aware Classification |
Garima Chhikara et.al. |
2402.18502v1 |
null |
2024-02-28 |
ROG$_{PL}$: Robust Open-Set Graph Learning via Region-Based Prototype Learning |
Qin Zhang et.al. |
2402.18495v1 |
null |
2024-02-27 |
Diffusion Meets DAgger: Supercharging Eye-in-hand Imitation Learning |
Xiaoyu Zhang et.al. |
2402.17768v1 |
null |
2024-02-27 |
Towards Optimal Learning of Language Models |
Yuxian Gu et.al. |
2402.17759v1 |
null |
2024-02-27 |
An Eye Gaze Heatmap Analysis of Uncertainty Head-Up Display Designs for Conditional Automated Driving |
Michael A. Gerber et.al. |
2402.17751v1 |
null |
2024-02-27 |
Scaling on-chip photonic neural processors using arbitrarily programmable wave propagation |
Tatsuhiro Onodera et.al. |
2402.17750v1 |
link |
2024-02-27 |
Linking Order to Strength in Metals |
Nicolas Argibay et.al. |
2402.17728v1 |
null |
2024-02-27 |
MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation |
Hanan Gani et.al. |
2402.17725v1 |
link |
2024-02-27 |
Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners |
Yazhou Xing et.al. |
2402.17723v1 |
null |
2024-02-27 |
Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers |
Yiwei Lu et.al. |
2402.17710v1 |
null |
2024-02-27 |
NextLevelBERT: Investigating Masked Language Modeling with Higher-Level Representations for Long Documents |
Tamara Czinczoll et.al. |
2402.17682v1 |
null |
2024-02-27 |
MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning |
Huiyu Xiong et.al. |
2402.17680v1 |
null |
2024-02-26 |
Open Your Ears to Take a Look: A State-of-the-Art Report on the Integration of Sonification and Visualization |
Kajetan Enge et.al. |
2402.16558v1 |
null |
2024-02-26 |
LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification |
Yiping Song et.al. |
2402.16515v1 |
null |
2024-02-26 |
Photonic Neural Network Fabricated on Thin Film Lithium Niobate for High-Fidelity and Power-Efficient Matrix Computation |
Yong Zheng et.al. |
2402.16513v1 |
null |
2024-02-26 |
Intelligent Known and Novel Aircraft Recognition -- A Shift from Classification to Similarity Learning for Combat Identification |
Ahmad Saeed et.al. |
2402.16486v1 |
null |
2024-02-26 |
Edge Detectors Can Make Deep Convolutional Neural Networks More Robust |
Jin Ding et.al. |
2402.16479v1 |
null |
2024-02-26 |
Autonomous Integration of TSN-unaware Applications with QoS Requirements in TSN Networks |
Moritz Fluechter et.al. |
2402.16454v1 |
null |
2024-02-26 |
Retrouver l'inventeur-auteur : la levée d'homonymies d'autorat entre les brevets et les publications scientifiques |
David Reymond et.al. |
2402.16440v1 |
null |
2024-02-26 |
Improving behavior based authentication against adversarial attack using XAI |
Dong Qin et.al. |
2402.16430v1 |
null |
2024-02-26 |
Adaptive Online Learning of Separable Path Graph Transforms for Intra-prediction |
Wen-Yang Lu et.al. |
2402.16371v1 |
null |
2024-02-26 |
DEYO: DETR with YOLO for End-to-End Object Detection |
Haodong Ouyang et.al. |
2402.16370v1 |
null |
2024-02-26 |
SPINEPS -- Automatic Whole Spine Segmentation of T2-weighted MR images using a Two-Phase Approach to Multi-class Semantic and Instance Segmentation |
Hendrik Möller et.al. |
2402.16368v1 |
link |
2024-02-26 |
An Integrated Data Processing Framework for Pretraining Foundation Models |
Yiding Sun et.al. |
2402.16358v1 |
link |
2024-02-26 |
What Text Design Characterizes Book Genres? |
Daichi Haraguchi et.al. |
2402.16356v1 |
null |
2024-02-23 |
A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends |
Abolfazl Younesi et.al. |
2402.15490v1 |
null |
2024-02-23 |
Retinotopic Mapping Enhances the Robustness of Convolutional Neural Networks |
Jean-Nicolas Jérémie et.al. |
2402.15480v1 |
null |
2024-02-23 |
FAIR: Filtering of Automatically Induced Rules |
Divya Jyoti Bajpai et.al. |
2402.15472v1 |
null |
2024-02-23 |
GROS: A General Robust Aggregation Strategy |
Alejandro Cholaquidis et.al. |
2402.15442v1 |
null |
2024-02-23 |
Hierarchical Invariance for Robust and Interpretable Vision Tasks at Larger Scales |
Shuren Qi et.al. |
2402.15430v1 |
link |
2024-02-23 |
ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation |
Yi Zhang et.al. |
2402.15429v1 |
link |
2024-02-23 |
Understanding Entrainment in Human Groups: Optimising Human-Robot Collaboration from Lessons Learned during Human-Human Collaboration |
Eike Schneiders et.al. |
2402.15427v1 |
null |
2024-02-23 |
PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning |
Simon Holk et.al. |
2402.15420v1 |
null |
2024-02-23 |
G-RepsNet: A Fast and General Construction of Equivariant Networks for Arbitrary Matrix Groups |
Sourya Basu et.al. |
2402.15413v1 |
null |
2024-02-23 |
A Universal Method for Solar Filament Detection from H-alpha Observations using Semi-supervised Deep Learning |
Andrea Diercke et.al. |
2402.15407v1 |
null |
2024-02-22 |
Link Prediction under Heterophily: A Physics-Inspired Graph Neural Network Approach |
Andrea Giuseppe Di Francesco et.al. |
2402.14802v1 |
null |
2024-02-22 |
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis |
Willi Menapace et.al. |
2402.14797v1 |
null |
2024-02-22 |
Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models |
Yixuan Ren et.al. |
2402.14780v1 |
null |
2024-02-22 |
Zero-Shot Pediatric Tuberculosis Detection in Chest X-Rays using Self-Supervised Learning |
Daniel Capellán-Martín et.al. |
2402.14741v1 |
null |
2024-02-22 |
Solitons of the mean curvature flow in $\mathbb{S}^2\times\mathbb{R}$ |
Rafael López et.al. |
2402.14727v1 |
null |
2024-02-22 |
A Transformer Model for Boundary Detection in Continuous Sign Language |
Razieh Rastgoo et.al. |
2402.14720v1 |
null |
2024-02-22 |
InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks |
Somnath Banerjee et.al. |
2402.14702v1 |
null |
2024-02-22 |
Big data analytics to classify earthwork-related locations: A Chengdu study |
Lei Yu et.al. |
2402.14698v1 |
null |
2024-02-22 |
Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off |
Futa Waseda et.al. |
2402.14648v1 |
null |
2024-02-22 |
Distributed Radiance Fields for Edge Video Compression and Metaverse Integration in Autonomous Driving |
Eugen Šlapak et.al. |
2402.14642v1 |
null |
2024-02-21 |
A Simple and Yet Fairly Effective Defense for Graph Neural Networks |
Sofiane Ennadir et.al. |
2402.13987v1 |
link |
2024-02-21 |
On modular representations of inner forms of $\mathrm{GL}_n$ over a local non-archimedean field |
Johannes Droschl et.al. |
2402.13969v1 |
null |
2024-02-21 |
New directions in algebraic statistics: Three challenges from 2023 |
Yulia Alexandr et.al. |
2402.13961v1 |
null |
2024-02-21 |
On the topological classification of complex plane curve singularities |
Alberto Fernández-Hernández et.al. |
2402.13941v1 |
null |
2024-02-21 |
Verifying message-passing neural networks via topology-based bounds tightening |
Christopher Hojny et.al. |
2402.13937v1 |
null |
2024-02-21 |
Tumor segmentation on whole slide images: training or prompting? |
Huaqian Wu et.al. |
2402.13932v1 |
null |
2024-02-21 |
BenchCloudVision: A Benchmark Analysis of Deep Learning Approaches for Cloud Detection and Segmentation in Remote Sensing Imagery |
Loddo Fabio et.al. |
2402.13918v1 |
link |
2024-02-21 |
An Explainable Transformer-based Model for Phishing Email Detection: A Large Language Model Approach |
Mohammad Amaz Uddin et.al. |
2402.13871v1 |
null |
2024-02-21 |
RFI-DRUnet: Restoring dynamic spectra corrupted by radio frequency interference -- Application to pulsar observations |
Xiao Zhang et.al. |
2402.13867v1 |
null |
2024-02-21 |
What we can learn from TikTok through its Research API |
Francesco Corso et.al. |
2402.13855v1 |
null |
2024-02-20 |
Video ReCap: Recursive Captioning of Hour-Long Videos |
Md Mohaiminul Islam et.al. |
2402.13250v1 |
null |
2024-02-20 |
SMORE: Similarity-based Hyperdimensional Domain Adaptation for Multi-Sensor Time Series Classification |
Junyao Wang et.al. |
2402.13233v1 |
null |
2024-02-20 |
A Touch, Vision, and Language Dataset for Multimodal Alignment |
Letian Fu et.al. |
2402.13232v1 |
null |
2024-02-20 |
NeRF Solves Undersampled MRI Reconstruction |
Tae Jun Jang et.al. |
2402.13226v1 |
null |
2024-02-20 |
VideoPrism: A Foundational Visual Encoder for Video Understanding |
Long Zhao et.al. |
2402.13217v1 |
null |
2024-02-20 |
How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena |
Marco Gaido et.al. |
2402.13208v1 |
null |
2024-02-20 |
A novel image correction method for cloud-affected observations with Imaging Atmospheric Cherenkov Telescopes |
Natalia Żywucka et.al. |
2402.13190v1 |
null |
2024-02-20 |
UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing |
Jianhong Bai et.al. |
2402.13185v1 |
null |
2024-02-20 |
DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models |
Norman Di Palo et.al. |
2402.13181v1 |
null |
2024-02-20 |
3D Kinematics Estimation from Video with a Biomechanical Model and Synthetic Training Data |
Zhi-Yi Lin et.al. |
2402.13172v1 |
null |
2024-02-19 |
Short-Period Variables in TESS Full-Frame Image Light Curves Identified via Convolutional Neural Networks |
Greg Olmschenk et.al. |
2402.12369v1 |
null |
2024-02-19 |
The first all-sky survey of star-forming galaxies with eROSITA: Scaling relations and a population of X-ray luminous starbursts |
E. Kyritsis et.al. |
2402.12367v1 |
null |
2024-02-19 |
An Adversarial Approach to Evaluating the Robustness of Event Identification Models |
Obai Bahwal et.al. |
2402.12338v1 |
null |
2024-02-19 |
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models |
Christian Schlarmann et.al. |
2402.12336v1 |
link |
2024-02-19 |
Generating Survival Interpretable Trajectories and Data |
Andrei V. Konstantinov et.al. |
2402.12331v1 |
null |
2024-02-19 |
Asymptotic Gaussian Fluctuations of Eigenvectors in Spectral Clustering |
Hugo Lebeau et.al. |
2402.12302v1 |
null |
2024-02-19 |
Time-periodic behaviour in one- and two-dimensional interacting particle systems |
Jonas Köppl et.al. |
2402.12300v1 |
null |
2024-02-19 |
Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports |
Felix J. Dorfner et.al. |
2402.12298v1 |
null |
2024-02-19 |
Revisiting registration-based synthesis: A focus on unsupervised MR image synthesis |
Savannah P. Hays et.al. |
2402.12288v1 |
null |
2024-02-19 |
Significance of Chirp MFCC as a Feature in Speech and Audio Applications |
S. Johanan Joysingh et.al. |
2402.12239v1 |
null |
2024-02-16 |
PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter |
Junfei Xiao et.al. |
2402.10896v1 |
null |
2024-02-16 |
Fusion of Diffusion Weighted MRI and Clinical Data for Predicting Functional Outcome after Acute Ischemic Stroke with Deep Contrastive Learning |
Chia-Ling Tsai et.al. |
2402.10894v1 |
null |
2024-02-16 |
Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation |
Ziyang Wang et.al. |
2402.10887v1 |
link |
2024-02-16 |
Control Color: Multimodal Diffusion-based Interactive Image Colorization |
Zhexin Liang et.al. |
2402.10855v1 |
null |
2024-02-16 |
HistoSegCap: Capsules for Weakly-Supervised Semantic Segmentation of Histological Tissue Type in Whole Slide Images |
Mobina Mansoori et.al. |
2402.10851v1 |
null |
2024-02-16 |
FedD2S: Personalized Data-Free Federated Knowledge Distillation |
Kawa Atapour et.al. |
2402.10846v1 |
null |
2024-02-16 |
Pedipulate: Enabling Manipulation Skills using a Quadruped Robot's Leg |
Philip Arm et.al. |
2402.10837v1 |
null |
2024-02-16 |
GAN-driven Electromagnetic Imaging of 2-D Dielectric Scatterers |
Ehtasham Naseer et.al. |
2402.10831v1 |
null |
2024-02-16 |
Structure results for torus fixed loci |
Jarod Alper et.al. |
2402.10823v1 |
null |
2024-02-16 |
Training Class-Imbalanced Diffusion Model Via Overlap Optimization |
Divin Yan et.al. |
2402.10821v1 |
link |
2024-02-15 |
Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling |
Raunaq Bhirangi et.al. |
2402.10211v1 |
null |
2024-02-15 |
FedAnchor: Enhancing Federated Semi-Supervised Learning with Label Contrastive Loss for Unlabeled Clients |
Xinchi Qiu et.al. |
2402.10191v1 |
null |
2024-02-15 |
Euclid preparation. Measuring detailed galaxy morphologies for Euclid with Machine Learning |
Euclid Collaboration et.al. |
2402.10187v1 |
link |
2024-02-15 |
DeepSRGM -- Sequence Classification and Ranking in Indian Classical Music with Deep Learning |
Sathwik Tejaswi Madhusudhan et.al. |
2402.10168v1 |
null |
2024-02-15 |
Holographic covering and the fortuity of black holes |
Chi-Ming Chang et.al. |
2402.10129v1 |
null |
2024-02-15 |
Classification Diffusion Models |
Shahar Yadin et.al. |
2402.10095v1 |
null |
2024-02-15 |
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations |
Benedikt Alkin et.al. |
2402.10093v1 |
link |
2024-02-15 |
GraphCBAL: Class-Balanced Active Learning for Graph Neural Networks via Reinforcement Learning |
Chengcheng Yu et.al. |
2402.10074v1 |
null |
2024-02-15 |
Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence |
Weixiang Zhao et.al. |
2402.10073v1 |
null |
2024-02-15 |
NYCTALE: Neuro-Evidence Transformer for Adaptive and Personalized Lung Nodule Invasiveness Prediction |
Sadaf Khademi et.al. |
2402.10066v1 |
null |
2024-02-14 |
LL-GABR: Energy Efficient Live Video Streaming Using Reinforcement Learning |
Adithya Raman et.al. |
2402.09392v1 |
null |
2024-02-14 |
GraSSRep: Graph-Based Self-Supervised Learning for Repeat Detection in Metagenomic Assembly |
Ali Azizpour et.al. |
2402.09381v1 |
link |
2024-02-14 |
Deep Rib Fracture Instance Segmentation and Classification from CT on the RibFrac Challenge |
Jiancheng Yang et.al. |
2402.09372v1 |
null |
2024-02-14 |
Magic-Me: Identity-Specific Video Customized Diffusion |
Ze Ma et.al. |
2402.09368v1 |
null |
2024-02-14 |
Small instanton-induced flavor invariants and the axion potential |
Ravneet Bedi et.al. |
2402.09361v1 |
null |
2024-02-14 |
Pruning Sparse Tensor Neural Networks Enables Deep Learning for 3D Ultrasound Localization Microscopy |
Brice Rauby et.al. |
2402.09359v1 |
null |
2024-02-14 |
DoRA: Weight-Decomposed Low-Rank Adaptation |
Shih-Yang Liu et.al. |
2402.09353v1 |
null |
2024-02-14 |
Irreducible representations of the crystallisation of the $C^{*}$-algebra $C(SU_{q}(n+1))$ |
Manabendra Giri et.al. |
2402.09347v1 |
null |
2024-02-14 |
Registration of Longitudinal Spine CTs for Monitoring Lesion Growth |
Malika Sanhinova et.al. |
2402.09341v1 |
null |
2024-02-14 |
Stability and Multigroup Fairness in Ranking with Uncertain Predictions |
Siddartha Devic et.al. |
2402.09326v1 |
null |
2024-02-13 |
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation |
Luke Melas-Kyriazi et.al. |
2402.08682v1 |
null |
2024-02-13 |
A Convergence Analysis of Approximate Message Passing with Non-Separable Functions and Applications to Multi-Class Classification |
Burak Çakmak et.al. |
2402.08676v1 |
null |
2024-02-13 |
Learning Emergent Gaits with Decentralized Phase Oscillators: on the role of Observations, Rewards, and Feedback |
Jenny Zhang et.al. |
2402.08662v1 |
null |
2024-02-13 |
BdSLW60: A Word-Level Bangla Sign Language Dataset |
Husne Ara Rubaiyeat et.al. |
2402.08635v1 |
link |
2024-02-13 |
Convolutional Neural Networks Towards Facial Skin Lesions Detection |
Reza Sarshar et.al. |
2402.08592v1 |
null |
2024-02-13 |
Totally geodesic submanifolds and polar actions on Stiefel manifolds |
Claudio Gorodski et.al. |
2402.08585v1 |
null |
2024-02-13 |
Motion-Adaptive Inference for Flexible Learned B-Frame Compression |
M. Akin Yilmaz et.al. |
2402.08550v1 |
null |
2024-02-13 |
Approximately Piecewise E(3) Equivariant Point Networks |
Matan Atzmon et.al. |
2402.08529v1 |
null |
2024-02-13 |
Reduced-order modeling of the dynamics of an inverted flag from experimental data |
Zhenwei Xu et.al. |
2402.08504v1 |
null |
2024-02-13 |
Intriguing Differences Between Zero-Shot and Systematic Evaluations of Vision-Language Transformer Models |
Shaeke Salman et.al. |
2402.08473v1 |
null |
2024-02-13 |
Wavefront Randomization Improves Deconvolution |
Amit Kohli et.al. |
2402.07900v2 |
null |
2024-02-12 |
Detection of Spider Mites on Labrador Beans through Machine Learning Approaches Using Custom Datasets |
Violet Liu et.al. |
2402.07895v1 |
null |
2024-02-12 |
Perfect stable regularity lemma and slice-wise stable hypergraphs |
Artem Chernikov et.al. |
2402.07870v1 |
null |
2024-02-12 |
On Computationally Efficient Multi-Class Calibration |
Parikshit Gopalan et.al. |
2402.07821v1 |
null |
2024-02-12 |
A Benchmark Grocery Dataset of Realworld Point Clouds From Single View |
Shivanand Venkanna Sheshappanavar et.al. |
2402.07819v1 |
null |
2024-02-12 |
Fixation for $\mathcal{U}$-Ising and $\mathcal{U}$-voter dynamics with frozen vertices |
Laure Marêché et.al. |
2402.07807v1 |
null |
2024-02-12 |
Estimation of non-uniform blur using a patch-based regression convolutional neural network (CNN) |
Luis G. Varela et.al. |
2402.07796v1 |
null |
2024-02-12 |
"Layer-by-layer" Unsupervised Clustering of Statistically Relevant Fluctuations in Noisy Time-series Data of Complex Dynamical Systems |
Matteo Becchi et.al. |
2402.07786v1 |
null |
2024-02-12 |
Solving parameter-dependent semi-algebraic systems |
Louis Gaillard et.al. |
2402.07782v1 |
null |
2024-02-12 |
Observations of the new meteor shower from comet 46P/Wirtanen |
D. Vida et.al. |
2402.07769v1 |
null |
2024-02-09 |
A two-stage algorithm in evolutionary product unit neural networks for classification |
Antonio J. Tallón-Ballesteros et.al. |
2402.06622v1 |
null |
2024-02-09 |
Image-based Deep Learning for the time-dependent prediction of fresh concrete properties |
Max Meyer et.al. |
2402.06611v1 |
null |
2024-02-09 |
SAE: Single Architecture Ensemble Neural Networks |
Martin Ferianc et.al. |
2402.06580v1 |
null |
2024-02-09 |
Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning |
Amir Ziai et.al. |
2402.06560v1 |
link |
2024-02-09 |
Self Supervised Learning for Improved Calibrationless Radial MRI with NLINV-Net |
Moritz Blumenthal et.al. |
2402.06550v1 |
null |
2024-02-09 |
Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA |
Marek Šuppa et.al. |
2402.06549v1 |
null |
2024-02-09 |
Feature Density Estimation for Out-of-Distribution Detection via Normalizing Flows |
Evan D. Cook et.al. |
2402.06537v1 |
null |
2024-02-09 |
Refining Myocardial Infarction Detection: A Novel Multi-Modal Composite Kernel Strategy in One-Class Classification |
Muhammad Uzair Zahid et.al. |
2402.06530v1 |
null |
2024-02-09 |
Flexible infinite-width graph convolutional networks and the importance of representation learning |
Ben Anson et.al. |
2402.06525v1 |
null |
2024-02-09 |
Dynamic swarms regulate the morphology and distribution of soft membrane domains |
Aakanksha Gubbala et.al. |
2402.06518v1 |
null |
2024-02-08 |
Classifying Nodes in Graphs without GNNs |
Daniel Winter et.al. |
2402.05934v1 |
link |
2024-02-08 |
An Interactive Agent Foundation Model |
Zane Durante et.al. |
2402.05929v1 |
null |
2024-02-08 |
Point-VOS: Pointing Up Video Object Segmentation |
Idil Esen Zulfikar et.al. |
2402.05917v1 |
null |
2024-02-08 |
A Survey on Detection, Classification, and Tracking of Aerial Threats using Radar and Communications Systems |
Wahab Khawaja et.al. |
2402.05909v1 |
null |
2024-02-09 |
Large Language Model Meets Graph Neural Network in Knowledge Distillation |
Shengxiang Hu et.al. |
2402.05894v2 |
null |
2024-02-08 |
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data |
Shufan Li et.al. |
2402.05892v1 |
null |
2024-02-08 |
CREMA: Multimodal Compositional Video Reasoning via Efficient Modular Adaptation and Fusion |
Shoubin Yu et.al. |
2402.05889v1 |
null |
2024-02-08 |
Sandwiched Compression: Repurposing Standard Codecs with Neural Network Wrappers |
Onur G. Guleryuz et.al. |
2402.05887v1 |
link |
2024-02-08 |
GET-Tok: A GenAI-Enriched Multimodal TikTok Dataset Documenting the 2022 Attempted Coup in Peru |
Gabriela Pinto et.al. |
2402.05882v1 |
link |
2024-02-08 |
You've Got to Feel It To Believe It: Multi-Modal Bayesian Inference for Semantic and Property Prediction |
Parker Ewen et.al. |
2402.05872v1 |
null |
2024-02-07 |
Edu-ConvoKit: An Open-Source Library for Education Conversation Data |
Rose E. Wang et.al. |
2402.05111v1 |
link |
2024-02-07 |
Moduli Parameters of Complex Singularities with Non-Degenerate Newton Boundary |
Janko Boehm et.al. |
2402.05093v1 |
null |
2024-02-07 |
Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation |
Ziyang Wang et.al. |
2402.05079v1 |
link |
2024-02-07 |
Arbitrary Scale Super-Resolution Assisted Lunar Crater Detection in Satellite Images |
Atal Tewari et.al. |
2402.05068v1 |
null |
2024-02-07 |
Efficient Multi-Resolution Fusion for Remote Sensing Data with Label Uncertainty |
Hersh Vakharia et.al. |
2402.05045v1 |
link |
2024-02-07 |
PAC Learnability under Explanation-Preserving Graph Perturbations |
Xu Zheng et.al. |
2402.05039v1 |
null |
2024-02-07 |
Strong convexity-guided hyper-parameter optimization for flatter losses |
Rahul Yedida et.al. |
2402.05025v1 |
null |
2024-02-07 |
Example-based Explanations for Random Forests using Machine Unlearning |
Tanmay Surve et.al. |
2402.05007v1 |
null |
2024-02-07 |
Randomized Confidence Bounds for Stochastic Partial Monitoring |
Maxime Heuillet et.al. |
2402.05002v1 |
null |
2024-02-07 |
Beyond explaining: XAI-based Adaptive Learning with SHAP Clustering for Energy Consumption Prediction |
Tobias Clement et.al. |
2402.04982v1 |
null |
2024-02-06 |
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters |
Quan Sun et.al. |
2402.04252v1 |
link |
2024-02-06 |
The spectrum of excisive functors |
Gregory Arone et.al. |
2402.04244v1 |
null |
2024-02-06 |
A classification of nonzero skew immaculate functions |
Sarah Mason et.al. |
2402.04219v1 |
null |
2024-02-06 |
Resource-Aware Hierarchical Federated Learning in Wireless Video Caching Networks |
Md Ferdous Pervej et.al. |
2402.04216v1 |
null |
2024-02-06 |
"Task Success" is not Enough: Investigating the Use of Video-Language Models as Behavior Critics for Catching Undesirable Agent Behaviors |
Lin Guan et.al. |
2402.04210v1 |
null |
2024-02-06 |
3D Volumetric Super-Resolution in Radiology Using 3D RRDB-GAN |
Juhyung Ha et.al. |
2402.04171v1 |
null |
2024-02-06 |
Human Emotions Analysis and Recognition Using EEG Signals in Response to 360$^\circ$ Videos |
Haseeb ur Rahman Abbasi et.al. |
2402.04142v1 |
null |
2024-02-06 |
Hierarchical Delay Attribution Classification using Unstructured Text in Train Management Systems |
Anton Borg et.al. |
2402.04108v1 |
null |
2024-02-06 |
Analysis of Deep Image Prior and Exploiting Self-Guidance for Image Reconstruction |
Shijun Liang et.al. |
2402.04097v1 |
null |
2024-02-06 |
A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation |
Zhengbo Wang et.al. |
2402.04087v1 |
link |
2024-02-05 |
Multiclass Classification Procedure for Detecting Attacks on MQTT-IoT Protocol |
Hector Alaiz-Moreton et.al. |
2402.03270v1 |
null |
2024-02-05 |
Security Advice for Parents and Children About Content Filtering and Circumvention as Found on YouTube and TikTok |
Ran Elgedawy et.al. |
2402.03255v1 |
null |
2024-02-05 |
JOBSKAPE: A Framework for Generating Synthetic Job Postings to Enhance Skill Matching |
Antoine Magron et.al. |
2402.03242v1 |
link |
2024-02-05 |
FROSTER: Frozen CLIP Is A Strong Teacher for Open-Vocabulary Action Recognition |
Xiaohu Huang et.al. |
2402.03241v1 |
null |
2024-02-05 |
IGUANe: a 3D generalizable CycleGAN for multicenter harmonization of brain MR images |
Vincent Roca et.al. |
2402.03227v1 |
null |
2024-02-05 |
English Prompts are Better for NLI-based Zero-Shot Emotion Classification than Target-Language Prompts |
Patrick Barreiß et.al. |
2402.03223v1 |
null |
2024-02-05 |
"Define Your Terms" : Enhancing Efficient Offensive Speech Classification with Definition |
Huy Nghiem et.al. |
2402.03221v1 |
link |
2024-02-05 |
Isotropy, Clusters, and Classifiers |
Timothee Mickus et.al. |
2402.03191v1 |
null |
2024-02-06 |
Cool-chic video: Learned video coding with 800 parameters |
Thomas Leguay et.al. |
2402.03179v2 |
null |
2024-02-05 |
Accurate and Well-Calibrated ICD Code Assignment Through Attention Over Diverse Label Embeddings |
Gonçalo Gomes et.al. |
2402.03172v1 |
link |
2024-02-02 |
From gas to stars: MUSEings on the internal evolution of IC 1613 |
S. Taibi et.al. |
2402.01631v1 |
null |
2024-02-02 |
Truncation technique for variational quantum eigensolver for Molecular Hamiltonians |
Qidong Xu et.al. |
2402.01630v1 |
null |
2024-02-02 |
L2G2G: a Scalable Local-to-Global Network Embedding with Graph Autoencoders |
Ruikang Ouyang et.al. |
2402.01614v1 |
link |
2024-02-02 |
Immersive Video Compression using Implicit Neural Representations |
Ho Man Kwan et.al. |
2402.01596v1 |
link |
2024-02-02 |
NeuroCine: Decoding Vivid Video Sequences from Human Brain Activities |
Jingyuan Sun et.al. |
2402.01590v1 |
null |
2024-02-02 |
Boximator: Generating Rich and Controllable Motions for Video Synthesis |
Jiawei Wang et.al. |
2402.01566v1 |
null |
2024-02-02 |
Deep Continuous Networks |
Nergis Tomen et.al. |
2402.01557v1 |
link |
2024-02-02 |
SLYKLatent, a Learning Framework for Facial Features Estimation |
Samuel Adebayo et.al. |
2402.01555v1 |
null |
2024-02-02 |
Advancing Brain Tumor Inpainting with Generative Models |
Ruizhi Zhu et.al. |
2402.01509v1 |
null |
2024-02-02 |
Di-NeRF: Distributed NeRF for Collaborative Learning with Unknown Relative Poses |
Mahboubeh Asadi et.al. |
2402.01485v1 |
null |
2024-02-01 |
We're Not Using Videos Effectively: An Updated Domain Adaptive Video Segmentation Baseline |
Simar Kareer et.al. |
2402.00868v1 |
link |
2024-02-01 |
Deep Room Impulse Response Completion |
Jackie Lin et.al. |
2402.00859v1 |
null |
2024-02-01 |
Early Time Classification with Accumulated Accuracy Gap Control |
Liran Ringel et.al. |
2402.00857v1 |
link |
2024-02-01 |
BootsTAP: Bootstrapped Training for Tracking-Any-Point |
Carl Doersch et.al. |
2402.00847v1 |
link |
2024-02-01 |
Emo-Avatar: Efficient Monocular Video Style Avatar through Texture Rendering |
Pinxin Liu et.al. |
2402.00827v1 |
null |
2024-02-01 |
Examining the Influence of Digital Phantom Models in Virtual Imaging Trials for Tomographic Breast Imaging |
Amar Kavuri et.al. |
2402.00812v1 |
null |
2024-02-01 |
ReAGent: Towards A Model-agnostic Feature Attribution Method for Generative Language Models |
Zhixue Zhao et.al. |
2402.00794v1 |
link |
2024-02-01 |
Distinguishing the Indistinguishable: Human Expertise in Algorithmic Prediction |
Rohan Alur et.al. |
2402.00793v1 |
link |
2024-02-02 |
CroissantLLM: A Truly Bilingual French-English Language Model |
Manuel Faysse et.al. |
2402.00786v2 |
link |
2024-02-01 |
Hybrid Quantum Vision Transformers for Event Classification in High Energy Physics |
Eyup B. Unlu et.al. |
2402.00776v1 |
null |
2024-01-31 |
Classification-Oriented Semantic Wireless Communications |
Emrecan Kutay et.al. |
2401.18069v1 |
null |
2024-01-31 |
Rank Supervised Contrastive Learning for Time Series Classification |
Qianying Ren et.al. |
2401.18057v1 |
null |
2024-01-31 |
Variable selection for Naïve Bayes classification |
Rafael Blanquero et.al. |
2401.18039v1 |
null |
2024-01-31 |
Optimizing contrastive learning for cortical folding pattern detection |
Aymeric Gaudin et.al. |
2401.18035v1 |
null |
2024-01-31 |
A Neural Enhancement Post-Processor with a Dynamic AV1 Encoder Configuration Strategy for CLIC 2024 |
Darren Ramsook et.al. |
2401.18021v1 |
null |
2024-01-31 |
EEG-GPT: Exploring Capabilities of Large Language Models for EEG Classification and Interpretation |
Jonathan W. Kim et.al. |
2401.18006v1 |
null |
2024-01-31 |
Unsupervised Learning of Topological Non-Abelian Braiding in Non-Hermitian Bands |
Yang Long et.al. |
2401.17968v1 |
null |
2024-01-31 |
Error-Tolerant E-Discovery Protocols |
Jinshuo Dong et.al. |
2401.17952v1 |
null |
2024-01-31 |
HyperZ$\cdot$Z$\cdot$W Operator Connects Slow-Fast Networks for Full Context Interaction |
Harvie Zhang et.al. |
2401.17948v1 |
null |
2024-01-31 |
Probabilistic Photonic Computing with Chaotic Light |
Frank Brückerhoff-Plückelmann et.al. |
2401.17915v1 |
null |
2024-01-30 |
The SRG/eROSITA all-sky survey: Hard X-ray selected Active Galactic Nuclei |
Sophia G. H. Waddell et.al. |
2401.17306v1 |
null |
2024-01-30 |
Compact white-dwarf binaries in the combined SRG/eROSITA/SDSS eFEDS survey |
A. Schwope et.al. |
2401.17304v1 |
null |
2024-01-30 |
Searching for X-ray counterparts of unassociated Fermi-LAT sources and rotation-powered pulsars with SRG/eROSITA |
Martin G. F. Mayer et.al. |
2401.17295v1 |
null |
2024-01-30 |
X-ray AGNs with SRG/eROSITA: Multi-wavelength observations reveal merger triggering and post-coalescence circumnuclear blowout |
Robert W. Bickley et.al. |
2401.17277v1 |
null |
2024-01-30 |
ReacLLaMA: Merging chemical and textual information in chemical reactivity AI models |
Aline Hartgers et.al. |
2401.17267v1 |
null |
2024-01-30 |
SLIC: A Learned Image Codec Using Structure and Color |
Srivatsa Prativadibhayankaram et.al. |
2401.17246v1 |
link |
2024-01-31 |
Faster coloring and embedding in dense hypergraphs via stability |
Jianfeng Hou et.al. |
2401.17219v2 |
null |
2024-01-31 |
GazeGPT: Augmenting Human Capabilities using Gaze-contingent Contextual AI for Smart Eyewear |
Robert Konrad et.al. |
2401.17217v2 |
null |
2024-01-30 |
Single Word Change is All You Need: Designing Attacks and Defenses for Text Classifiers |
Lei Xu et.al. |
2401.17196v1 |
null |
2024-01-30 |
GraphViz2Vec: A Structure-aware Feature Generation Model to Improve Classification in GNNs |
Shraban Kumar Chatterjee et.al. |
2401.17178v1 |
null |
2024-01-29 |
Computer Vision for Primate Behavior Analysis in the Wild |
Richard Vogg et.al. |
2401.16424v1 |
null |
2024-01-29 |
Synchformer: Efficient Synchronization from Sparse Cues |
Vladimir Iashin et.al. |
2401.16423v1 |
null |
2024-01-29 |
Strategic Usage in a Multi-Learner Setting |
Eliot Shekhtman et.al. |
2401.16422v1 |
null |
2024-01-29 |
ReTaSA: A Nonparametric Functional Estimation Approach for Addressing Continuous Target Shift |
Hwanwoo Kim et.al. |
2401.16410v1 |
null |
2024-01-29 |
Is K-fold cross validation the best model selection method for Machine Learning? |
Juan M Gorriz et.al. |
2401.16407v1 |
null |
2024-01-29 |
Zero-shot Imitation Policy via Search in Demonstration Dataset |
Federco Malato et.al. |
2401.16398v1 |
null |
2024-01-29 |
Ovarian Cancer Diagnostics using Wavelet Packet Scaling Descriptors |
Raymond J. Hinton Jr. et.al. |
2401.16396v1 |
null |
2024-01-29 |
Evaluation of pseudo-healthy image reconstruction for anomaly detection with deep generative models: Application to brain FDG PET |
Ravi Hassanaly et.al. |
2401.16363v1 |
link |
2024-01-29 |
Curriculum-Based Reinforcement Learning for Quadrupedal Jumping: A Reference-free Design |
Vassil Atanassov et.al. |
2401.16337v1 |
null |
2024-01-29 |
Making the unmodulated Pyramid wavefront sensor smart. Closed-loop demonstration of neural network wavefront reconstruction with MagAO-X |
Rico Landman et.al. |
2401.16325v1 |
null |
2024-01-26 |
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities |
Chaochao Lu et.al. |
2401.15071v1 |
null |
2024-01-26 |
Deep learning-based approach for tomato classification in complex scenes |
Mikael A. Mousse et.al. |
2401.15055v1 |
null |
2024-01-26 |
Non-Unitary $3 \times 3$ Mixing in Majorana Neutrinos and Vector-like Quark Models |
Pedro M. F. Pereira et.al. |
2401.15049v1 |
null |
2024-01-26 |
Machine learning-based analysis of glioma tissue sections: a review |
Jan-Philipp Redlich et.al. |
2401.15022v1 |
null |
2024-01-26 |
Enhancement of a Text-Independent Speaker Verification System by using Feature Combination and Parallel-Structure Classifiers |
Kerlos Atia Abdalmalak et.al. |
2401.15018v1 |
null |
2024-01-26 |
Graph-based Active Learning for Entity Cluster Repair |
Victor Christen et.al. |
2401.14992v1 |
null |
2024-01-26 |
Stokes graphs of the Rabi problem with real parameters |
René Langøen et.al. |
2401.14991v1 |
null |
2024-01-26 |
Minimum-dissipation principle for synchronised stochastic oscillators far from equilibrium |
Jan Meibohm et.al. |
2401.14982v1 |
null |
2024-01-26 |
Microwave lymphedema assessment using deep learning with contour assisted backprojection |
Yuyi Chang et.al. |
2401.14970v1 |
null |
2024-01-26 |
Hold Tight: Identifying Behavioral Patterns During Prolonged Work in VR through Video Analysis |
Verena Biener et.al. |
2401.14920v1 |
null |
2024-01-25 |
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities |
Yiyuan Zhang et.al. |
2401.14405v1 |
link |
2024-01-25 |
Adaptive Mobile Manipulation for Articulated Objects In the Open World |
Haoyu Xiong et.al. |
2401.14403v1 |
null |
2024-01-25 |
Range-Agnostic Multi-View Depth Estimation With Keyframe Selection |
Andrea Conti et.al. |
2401.14401v1 |
link |
2024-01-25 |
Rethinking Patch Dependence for Masked Autoencoders |
Letian Fu et.al. |
2401.14391v1 |
null |
2024-01-25 |
Smooth Ranking SVM via Cutting-Plane Method |
Erhan Can Ozcan et.al. |
2401.14388v1 |
link |
2024-01-25 |
Inconsistency Masks: Removing the Uncertainty from Input-Pseudo-Label Pairs |
Michael R. H. Vorndran et.al. |
2401.14387v1 |
link |
2024-01-25 |
A Comparative Analysis of Noise Reduction Methods in Sentiment Analysis on Noisy Bengali Texts |
Kazi Toufique Elahi et.al. |
2401.14360v1 |
link |
2024-01-25 |
Computing Derivations on Nilpotent Quadratic Lie Algebras |
Pilar Benito et.al. |
2401.14348v1 |
null |
2024-01-25 |
Class-attribute Priors: Adapting Optimization to Heterogeneity and Fairness Objective |
Xuechen Zhang et.al. |
2401.14343v1 |
null |
2024-01-25 |
Progressive Multi-task Anti-Noise Learning and Distilling Frameworks for Fine-grained Vehicle Recognition |
Dichao Liu et.al. |
2401.14336v1 |
link |
2024-01-24 |
Tyche: Stochastic In-Context Learning for Medical Image Segmentation |
Marianne Rakic et.al. |
2401.13650v1 |
null |
2024-01-24 |
Quantifying the Impact of Frame Preemption on Combined TSN Shapers |
Rubi Debnath et.al. |
2401.13631v1 |
null |
2024-01-24 |
Can overfitted deep neural networks in adversarial training generalize? -- An approximation viewpoint |
Zhongjie Shi et.al. |
2401.13624v1 |
null |
2024-01-24 |
FLLIC: Functionally Lossless Image Compression |
Xi Zhang et.al. |
2401.13616v1 |
null |
2024-01-24 |
Enhancing Image Retrieval: A Comprehensive Study on Photo Search using the CLIP Mode |
Naresh Kumar Lahajal et.al. |
2401.13613v1 |
null |
2024-01-24 |
Prompt Weight Experiments for LLM Instruction Fine-Tuning |
Mathew Huerta-Enochian et.al. |
2401.13586v1 |
null |
2024-01-24 |
WPDA: Frequency-based Backdoor Attack with Wavelet Packet Decomposition |
Zhengyao Song et.al. |
2401.13578v1 |
null |
2024-01-24 |
CNN architecture extraction on edge GPU |
Peter Horvath et.al. |
2401.13575v1 |
null |
2024-01-24 |
Benchmarking the Fairness of Image Upsampling Methods |
Mike Laszkiewicz et.al. |
2401.13555v1 |
null |
2024-01-24 |
PanAf20K: A Large Video Dataset for Wild Ape Detection and Behaviour Recognition |
Otto Brookes et.al. |
2401.13554v1 |
null |
2024-01-23 |
SegmentAnyBone: A Universal Model that Segments Any Bone at Any Location on MRI |
Hanxue Gu et.al. |
2401.12974v1 |
null |
2024-01-23 |
On the Efficacy of Text-Based Input Modalities for Action Anticipation |
Apoorva Beedu et.al. |
2401.12972v1 |
null |
2024-01-23 |
The role of environment and AGN feedback in quenching local galaxies: Comparing cosmological hydrodynamical simulations to the SDSS |
Paul H. Goubert et.al. |
2401.12953v1 |
null |
2024-01-23 |
Lumiere: A Space-Time Diffusion Model for Video Generation |
Omer Bar-Tal et.al. |
2401.12945v1 |
null |
2024-01-23 |
Long-range three-dimensional tracking of nanoparticles using interferometric scattering (iSCAT) microscopy |
Kiarash Kasaian et.al. |
2401.12939v1 |
null |
2024-01-23 |
Neural deformation fields for template-based reconstruction of cortical surfaces from MRI |
Fabian Bongratz et.al. |
2401.12938v1 |
null |
2024-01-23 |
Segmentation of tibiofemoral joint tissues from knee MRI using MtRA-Unet and incorporating shape information: Data from the Osteoarthritis Initiative |
Akshay Daydar et.al. |
2401.12932v1 |
null |
2024-01-23 |
pyAKI - An Open Source Solution to Automated KDIGO classification |
Christian Porschen et.al. |
2401.12930v1 |
null |
2024-01-23 |
Performance Analysis of Support Vector Machine (SVM) on Challenging Datasets for Forest Fire Detection |
Ankan Kar et.al. |
2401.12924v1 |
null |
2024-01-23 |
Advancing Glitch Classification in Gravity Spy: Multi-view Fusion with Attention-based Machine Learning for Advanced LIGO's Fourth Observing Run |
Yunan Wu et.al. |
2401.12913v1 |
null |
2024-01-22 |
Connecting the Dots: Leveraging Spatio-Temporal Graph Neural Networks for Accurate Bangla Sign Language Recognition |
Haz Sameen Shahgir et.al. |
2401.12210v1 |
null |
2024-01-22 |
Unsupervised Machine Learning for the Classification of Astrophysical X-ray Sources |
Víctor Samuel Pérez-Díaz et.al. |
2401.12203v1 |
link |
2024-01-22 |
OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics |
Peiqi Liu et.al. |
2401.12202v1 |
null |
2024-01-22 |
In-Context Learning for Extreme Multi-Label Classification |
Karel D'Oosterlinck et.al. |
2401.12178v1 |
null |
2024-01-22 |
Broiler-Net: A Deep Convolutional Framework for Broiler Behavior Analysis in Poultry Houses |
Tahereh Zarrat Ehsan et.al. |
2401.12176v1 |
link |
2024-01-22 |
VRMN-bD: A Multi-modal Natural Behavior Dataset of Immersive Human Fear Responses in VR Stand-up Interactive Games |
He Zhang et.al. |
2401.12133v1 |
link |
2024-01-22 |
Evaluation of QCNN-LSTM for Disability Forecasting in Multiple Sclerosis Using Sequential Multisequence MRI |
John D. Mayfield et.al. |
2401.12132v1 |
null |
2024-01-22 |
Out-of-Distribution Detection & Applications With Ablated Learned Temperature Energy |
Will LeVine et.al. |
2401.12129v1 |
link |
2024-01-22 |
Measures of the Capital Network of the U.S. Economy |
Ben Klemens et.al. |
2401.12118v1 |
null |
2024-01-22 |
A quantitative version of the Steinhaus theorem |
Alex Iosevich et.al. |
2401.12112v1 |
null |
2024-01-19 |
Classifying affine structures with focus-focus singularities |
Xiudi Tang et.al. |
2401.10881v1 |
null |
2024-01-19 |
Motion Consistency Loss for Monocular Visual Odometry with Attention-Based Deep Learning |
André O. Françani et.al. |
2401.10857v1 |
null |
2024-01-19 |
Emotion Classification In Software Engineering Texts: A Comparative Analysis of Pre-trained Transformers Language Models |
Mia Mohammad Imran et.al. |
2401.10845v1 |
null |
2024-01-19 |
Understanding Video Transformers via Universal Concept Discovery |
Matthew Kowal et.al. |
2401.10831v1 |
null |
2024-01-19 |
Long-Term Monitoring of the Oe Star VES 735: Ope! Not So Quiet After All |
Brandon Marshall et.al. |
2401.10829v1 |
null |
2024-01-19 |
ActAnywhere: Subject-Aware Video Background Generation |
Boxiao Pan et.al. |
2401.10822v1 |
null |
2024-01-19 |
RAD-DINO: Exploring Scalable Medical Image Encoders Beyond Text Supervision |
Fernando Pérez-García et.al. |
2401.10815v1 |
null |
2024-01-19 |
Learning to Visually Connect Actions and their Effects |
Eric Peh et.al. |
2401.10805v1 |
null |
2024-01-19 |
Endovascular Detection of Catheter-Thrombus Contact by Vacuum Excitation |
Jared Lawson et.al. |
2401.10804v1 |
null |
2024-01-19 |
TDC-less Direct Time-of-Flight Imaging Using Spiking Neural Networks |
Jack MacLean et.al. |
2401.10793v1 |
null |
2024-01-18 |
Simultaneous Tactile Estimation and Control for Extrinsic Dexterity |
Antonia Bronars et.al. |
2401.10230v1 |
null |
2024-01-18 |
OMG-Seg: Is One Model Good Enough For All Segmentation? |
Xiangtai Li et.al. |
2401.10229v1 |
link |
2024-01-18 |
RAP-SAM: Towards Real-Time All-Purpose Segment Anything |
Shilin Xu et.al. |
2401.10228v1 |
link |
2024-01-18 |
Towards Language-Driven Video Inpainting via Multimodal Large Language Models |
Jianzong Wu et.al. |
2401.10226v1 |
null |
2024-01-18 |
Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions |
Namitha Padmanabhan et.al. |
2401.10217v1 |
null |
2024-01-18 |
Transfer Learning in Human Activity Recognition: A Survey |
Sourish Gunesh Dhekane et.al. |
2401.10185v1 |
null |
2024-01-18 |
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild |
Andreas Engelhardt et.al. |
2401.10171v1 |
null |
2024-01-19 |
Motion-Zero: Zero-Shot Moving Object Control Framework for Diffusion-Based Video Generation |
Changgu Chen et.al. |
2401.10150v2 |
null |
2024-01-18 |
Few-shot learning for COVID-19 Chest X-Ray Classification with Imbalanced Data: An Inter vs. Intra Domain Study |
Alejandro Galán-Cuenca et.al. |
2401.10129v1 |
null |
2024-01-18 |
Sub2Full: split spectrum to boost OCT despeckling without clean data |
Lingyun Wang et.al. |
2401.10128v1 |
link |
2024-01-17 |
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model |
Lianghui Zhu et.al. |
2401.09417v1 |
link |
2024-01-17 |
Vlogger: Make Your Dream A Vlog |
Shaobin Zhuang et.al. |
2401.09414v1 |
link |
2024-01-17 |
Deciphering Textual Authenticity: A Generalized Strategy through the Lens of Large Language Semantics for Detecting Human vs. Machine-Generated Text |
Mazal Bethany et.al. |
2401.09407v1 |
null |
2024-01-17 |
Élivágar: Efficient Quantum Circuit Search for Classification |
Sashwat Anagolum et.al. |
2401.09393v1 |
null |
2024-01-17 |
Tri$^{2}$-plane: Volumetric Avatar Reconstruction with Feature Pyramid |
Luchuan Song et.al. |
2401.09386v1 |
link |
2024-01-17 |
New relations of pod partition and its connection with other partition functions |
Hemjyoti Nath et.al. |
2401.09374v1 |
null |
2024-01-17 |
To deform or not: treatment-aware longitudinal registration for breast DCE-MRI during neoadjuvant chemotherapy via unsupervised keypoints detection |
Luyi Han et.al. |
2401.09336v1 |
link |
2024-01-17 |
Machines Do See Color: A Guideline to Classify Different Forms of Racist Discourse in Large Corpora |
Diana Davila Gordillo et.al. |
2401.09333v1 |
null |
2024-01-17 |
Spectral Distribution Complexity of the Surface Fibrillatory Waves Predicts Post-Catheter Ablation Relapse in Persistent Atrial Fibrillation |
Pilar Escribano et.al. |
2401.09297v1 |
null |
2024-01-17 |
T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis |
Yoonjin Chung et.al. |
2401.09294v1 |
null |
2024-01-16 |
From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers |
Jiu Feng et.al. |
2401.08415v1 |
null |
2024-01-16 |
Faster ISNet for Background Bias Mitigation on Deep Neural Networks |
Pedro R. A. S. Bassi et.al. |
2401.08409v1 |
null |
2024-01-16 |
Training and Comparison of nnU-Net and DeepMedic Methods for Autosegmentation of Pediatric Brain Tumors |
Arastoo Vossough et.al. |
2401.08404v1 |
null |
2024-01-16 |
High-Quality Mesh Blendshape Generation from Face Videos via Neural Inverse Rendering |
Xin Ming et.al. |
2401.08398v1 |
null |
2024-01-16 |
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models |
Zongxin Yang et.al. |
2401.08392v1 |
link |
2024-01-16 |
We don't need no labels: Estimating post-deployment model performance under covariate shift without ground truth |
Jakub Białek et.al. |
2401.08348v1 |
null |
2024-01-16 |
Learn What You Need in Personalized Federated Learning |
Kexin Lv et.al. |
2401.08327v1 |
link |
2024-01-16 |
Application of LLM Agents in Recruitment: A Novel Framework for Resume Screening |
Chengguang Gan et.al. |
2401.08315v1 |
null |
2024-01-16 |
Central extensions of restricted Lie superalgebras and classification of $p$-nilpotent Lie superalgebras in dimension $4$ |
Sofiane Bouarroudj et.al. |
2401.08313v1 |
null |
2024-01-16 |
Evaluating online elasticity estimation of soft objects using standard robot grippers |
Shubhan P. Patni et.al. |
2401.08298v1 |
null |
2024-01-16 |
Multitask Learning in Minimally Invasive Surgical Vision: A Review |
Oluwatosin Alabi et.al. |
2401.08256v1 |
null |
2024-01-16 |
Multi-scale 2D Temporal Map Diffusion Models for Natural Language Video Localization |
Chongzhi Zhang et.al. |
2401.08232v1 |
null |
2024-01-16 |
Towards Causal Relationship in Indefinite Data: Baseline Model and New Datasets |
Hang Chen et.al. |
2401.08221v1 |
link |
2024-01-16 |
Ship Detection in SAR Images with Human-in-the-Loop |
Hecheng Jia et.al. |
2401.08213v1 |
null |
2024-01-16 |
ModelNet-O: A Large-Scale Synthetic Dataset for Occlusion-Aware Point Cloud Classification |
Zhongbin Fang et.al. |
2401.08210v1 |
link |
2024-01-12 |
Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements |
Anton Voronov et.al. |
2401.06766v1 |
null |
2024-01-12 |
Classification of singularities of cluster algebras of finite type II: coefficients |
Angélica Benito et.al. |
2401.06758v1 |
null |
2024-01-12 |
Synthetic Data Generation Framework, Dataset, and Efficient Deep Model for Pedestrian Intention Prediction |
Muhammad Naveed Riaz et.al. |
2401.06757v1 |
null |
2024-01-12 |
Stylometry Analysis of Multi-authored Documents for Authorship and Author Style Change Detection |
Muhammad Tayyab Zamir et.al. |
2401.06752v1 |
null |
2024-01-12 |
Efficient Parallel Algorithms for Inpainting-Based Representations of 4K Images -- Part II: Spatial and Tonal Data Optimization |
Niklas Kämper et.al. |
2401.06747v1 |
null |
2024-01-12 |
Efficient Parallel Algorithms for Inpainting-Based Representations of 4K Images -- Part I: Homogeneous Diffusion Inpainting |
Niklas Kämper et.al. |
2401.06744v1 |
null |
2024-01-12 |
Complexity Classification of Product State Problems for Local Hamiltonians |
John Kallaugher et.al. |
2401.06725v1 |
null |
2024-01-12 |
Obstacle-Aware Positioning of a Mobile Robotic Platform for 6G Networks |
Alexandre Costa et.al. |
2401.06717v1 |
null |
2024-01-12 |
Reliability Analysis of Psychological Concept Extraction and Classification in User-penned Text |
Muskan Garg et.al. |
2401.06709v1 |
null |
2024-01-12 |
On the existence of charged electrostatic black holes in arbitrary topology |
Martin Reiris et.al. |
2401.06702v1 |
null |
2024-01-11 |
Distilling Vision-Language Models on Millions of Videos |
Yue Zhao et.al. |
2401.06129v1 |
null |
2024-01-11 |
Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors |
Jack Saunders et.al. |
2401.06126v1 |
null |
2024-01-11 |
Gaussian Shadow Casting for Neural Characters |
Luis Bolanos et.al. |
2401.06116v1 |
null |
2024-01-11 |
A Closer Look at AUROC and AUPRC under Class Imbalance |
Matthew B. A. McDermott et.al. |
2401.06091v1 |
link |
2024-01-12 |
LEGO: Language Enhanced Multi-modal Grounding Model |
Zhaowei Li et.al. |
2401.06071v2 |
link |
2024-01-11 |
On the Power of Graph Neural Networks and Feature Augmentation Strategies to Classify Social Networks |
Walid Guettala et.al. |
2401.06048v1 |
null |
2024-01-11 |
RAVEN: Rethinking Adversarial Video Generation with Efficient Tri-plane Networks |
Partha Ghosh et.al. |
2401.06035v1 |
null |
2024-01-11 |
Attention to detail: inter-resolution knowledge distillation |
Rocío del Amor et.al. |
2401.06010v1 |
link |
2024-01-11 |
Sea ice detection using concurrent multispectral and synthetic aperture radar imagery |
Martin S J Rogers et.al. |
2401.06009v1 |
null |
2024-01-11 |
Boosting Mixed-Initiative Co-Creativity in Game Design: A Tutorial |
Solange Margarido et.al. |
2401.05999v1 |
null |
2024-01-10 |
Towards Online Sign Language Recognition and Translation |
Ronglai Zuo et.al. |
2401.05336v1 |
link |
2024-01-10 |
ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video |
Kevin Cai et.al. |
2401.05314v1 |
link |
2024-01-10 |
Strategic Client Selection to Address Non-IIDness in HAPS-enabled FL Networks |
Amin Farajzadeh et.al. |
2401.05308v1 |
null |
2024-01-10 |
Frame-like Fourier expansions for finite Borel measures on $\mathbb{R}$ |
Chad Berner et.al. |
2401.05243v1 |
null |
2024-01-10 |
Learning effective good variables from physical data |
Giulio Barletta et.al. |
2401.05226v1 |
link |
2024-01-10 |
TOVAC: Tele-operated Vehicle Admission Control and Routing |
Jorge Martín-Pérez et.al. |
2401.05225v1 |
null |
2024-01-10 |
Do Vision and Language Encoders Represent the World Similarly? |
Mayug Maniparambil et.al. |
2401.05224v1 |
null |
2024-01-10 |
Exploring Vulnerabilities of No-Reference Image Quality Assessment Models: A Query-Based Black-Box Method |
Chenxi Yang et.al. |
2401.05217v1 |
null |
2024-01-10 |
Pre-trained Large Language Models for Financial Sentiment Analysis |
Wei Luo et.al. |
2401.05215v1 |
link |
2024-01-10 |
A Novel Prompt-tuning Method: Incorporating Scenario-specific Concepts into a Verbalizer |
Yong Ma et.al. |
2401.05204v1 |
null |
2024-01-09 |
A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars |
Ronglai Zuo et.al. |
2401.04730v1 |
link |
2024-01-09 |
U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation |
Jun Ma et.al. |
2401.04722v1 |
null |
2024-01-09 |
Helicoidal surfaces of prescribed mean curvature in $\mathbb{R}^3$ |
Aires Eduardo Menani Barbieri et.al. |
2401.04721v1 |
null |
2024-01-09 |
Low-resource finetuning of foundation models beats state-of-the-art in histopathology |
Benedikt Roth et.al. |
2401.04720v1 |
null |
2024-01-09 |
Jump Cut Smoothing for Talking Heads |
Xiaojuan Wang et.al. |
2401.04718v1 |
null |
2024-01-09 |
NIPn CHIPS |
Blaise Boissonneau et.al. |
2401.04697v1 |
null |
2024-01-09 |
CoordGate: Efficiently Computing Spatially-Varying Convolutions in Convolutional Neural Networks |
Sunny Howard et.al. |
2401.04680v1 |
null |
2024-01-09 |
Benchmark Analysis of Various Pre-trained Deep Learning Models on ASSIRA Cats and Dogs Dataset |
Galib Muhammad Shahriar Himel et.al. |
2401.04666v1 |
null |
2024-01-09 |
DepressionEmo: A novel dataset for multilabel classification of depression emotions |
Abu Bakar Siddiqur Rahman et.al. |
2401.04655v1 |
link |
2024-01-09 |
Hold 'em and Fold 'em: Towards Human-scale, Feedback-Controlled Soft Origami Robots |
Immanuel Ampomah Mensah et.al. |
2401.04650v1 |
null |
2024-01-08 |
Dr$^2$Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning |
Chen Zhao et.al. |
2401.04105v1 |
null |
2024-01-08 |
RudolfV: A Foundation Model by Pathologists for Pathologists |
Jonas Dippel et.al. |
2401.04079v1 |
null |
2024-01-08 |
Variance Reduction in Ratio Metrics for Efficient Online Experiments |
Shubham Baweja et.al. |
2401.04062v1 |
null |
2024-01-08 |
Bjøntegaard Delta (BD): A Tutorial Overview of the Metric, Evolution, Challenges, and Recommendations |
Nabajeet Barman et.al. |
2401.04039v1 |
null |
2024-01-08 |
Blocks whose defect groups are Suzuki $2$-groups |
Charles W. Eaton et.al. |
2401.04028v1 |
null |
2024-01-08 |
IDoFew: Intermediate Training Using Dual-Clustering in Language Models for Few Labels Text Classification |
Abdullah Alsuhaibani et.al. |
2401.04025v1 |
null |
2024-01-08 |
Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification |
Wentao Zhu et.al. |
2401.04023v1 |
null |
2024-01-08 |
Resident space object detection method based on the connection between Fourier spectrum of the video data difference frame and the linear velocity projection |
V. S. Baranova et.al. |
2401.04021v1 |
null |
2024-01-09 |
Recognizing Blazars Using Radio Morphology from the VLA Sky Survey |
Zhang-Liang Xie et.al. |
2401.04009v2 |
null |
2024-01-08 |
Calabi-Yau Varieties via Cyclic Covers, and Complex Hyperbolic Structures for their Moduli Spaces |
Chenglong Yu et.al. |
2401.04006v1 |
null |
2024-01-05 |
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively |
Haobo Yuan et.al. |
2401.02955v1 |
link |
2024-01-05 |
The Dark Energy Survey Supernova Program: Cosmological Analysis and Systematic Uncertainties |
M. Vincenzi et.al. |
2401.02945v1 |
null |
2024-01-05 |
Digital-analog quantum learning on Rydberg atom arrays |
Jonathan Z. Lu et.al. |
2401.02940v1 |
null |
2024-01-05 |
Mixing Magnetic and Electric Ehlers-Harrison transformations: The Electromagnetic Swirling Spacetime and Novel Type I Backgrounds |
José Barrientos et.al. |
2401.02924v1 |
null |
2024-01-05 |
Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks |
Kevin Everson et.al. |
2401.02921v1 |
null |
2024-01-05 |
Analytically-Driven Resource Management for Cloud-Native Microservices |
Yanqi Zhang et.al. |
2401.02920v1 |
null |
2024-01-05 |
Introducing Bode: A Fine-Tuned Large Language Model for Portuguese Prompt-Based Task |
Gabriel Lino Garcia et.al. |
2401.02909v1 |
null |
2024-01-05 |
Robust Bichromatic Classification using Two Lines |
Erwin Glazenburg et.al. |
2401.02897v1 |
null |
2024-01-05 |
Particle-Wise Higher-Order SPH Field Approximation for DVR |
Jonathan Fischer et.al. |
2401.02896v1 |
null |
2024-01-05 |
Nonlinear functional regression by functional deep neural network with kernel embedding |
Zhongjie Shi et.al. |
2401.02890v1 |
null |
2024-01-04 |
asimulation: Domain formation and impact on observables in resolved cosmological simulations of the (a)symmetron |
Øyvind Christiansen et.al. |
2401.02410v1 |
link |
2024-01-04 |
Gravitational waves from dark domain walls |
Øyvind Christiansen et.al. |
2401.02409v1 |
link |
2024-01-05 |
Correctness Comparison of ChatGPT-4, Bard, Claude-2, and Copilot for Spatial Tasks |
Hartwig H. Hochmair et.al. |
2401.02404v2 |
null |
2024-01-04 |
3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation |
Zihao Xiao et.al. |
2401.02402v1 |
null |
2024-01-04 |
Analyzing Misinformation Claims During the 2022 Brazilian General Election on WhatsApp, Twitter, and Kwai |
Scott A. Hale et.al. |
2401.02395v1 |
null |
2024-01-04 |
Image denoising and model-independent parameterization for improving IVIM MRI |
Caleb Sample et.al. |
2401.02394v1 |
null |
2024-01-04 |
Survey of 3D Human Body Pose and Shape Estimation Methods for Contemporary Dance Applications |
Darshan Venkatrayappa et.al. |
2401.02383v1 |
null |
2024-01-04 |
A novel method to enhance pneumonia detection via a model-level ensembling of CNN and vision transformer |
Sandeep Angara et.al. |
2401.02358v1 |
null |
2024-01-04 |
ClassWise-SAM-Adapter: Parameter Efficient Fine-tuning Adapts Segment Anything to SAR Domain for Semantic Segmentation |
Xinyang Pu et.al. |
2401.02326v1 |
link |
2024-01-04 |
Reflection physics in X-ray-emitting Symbiotic Stars |
Jesús A. Toalá et.al. |
2401.02318v1 |
null |
2024-01-03 |
Profinite equivariant spectra and their tensor-triangular geometry |
Scott Balchin et.al. |
2401.01878v1 |
null |
2024-01-03 |
A spatial mixture model for spaceborne lidar observations over mixed forest and non-forest land types |
Paul B. May et.al. |
2401.01848v1 |
null |
2024-01-03 |
Teaching with a companion: the case of gravity |
Iuliia Zhurakovskaia et.al. |
2401.01832v1 |
null |
2024-01-03 |
Iterative Mask Filling: An Effective Text Augmentation Method Using Masked Language Modeling |
Himmet Toprak Kesgin et.al. |
2401.01830v1 |
null |
2024-01-03 |
Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions |
David Junhao Zhang et.al. |
2401.01827v1 |
link |
2024-01-03 |
Detours for Navigating Instructional Videos |
Kumar Ashutosh et.al. |
2401.01823v1 |
null |
2024-01-03 |
SENS3: Multisensory Database of Finger-Surface Interactions and Corresponding Sensations |
Jagan K. Balasubramanian et.al. |
2401.01818v1 |
null |
2024-01-03 |
Signal Processing in the Retina: Interpretable Graph Classifier to Predict Ganglion Cell Responses |
Yasaman Parhizkar et.al. |
2401.01813v1 |
null |
2024-01-03 |
Efficient Computation of Confidence Sets Using Classification on Equidistributed Grids |
Lujie Zhou et.al. |
2401.01804v1 |
null |
2024-01-03 |
An experimental sorting method for improving metagenomic data encoding |
Diogo Pratas et.al. |
2401.01786v1 |
null |
2024-01-02 |
Street Gaussians for Modeling Dynamic Urban Scenes |
Yunzhi Yan et.al. |
2401.01339v1 |
null |
2024-01-02 |
Classifying Words with 3-sort Automata |
Tomasz Jastrząb et.al. |
2401.01314v1 |
null |
2024-01-03 |
A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models |
S. M Towhidul Islam Tonmoy et.al. |
2401.01313v2 |
null |
2024-01-02 |
Integrating Edges into U-Net Models with Explainable Activation Maps for Brain Tumor Segmentation using MR Images |
Subin Sahayam et.al. |
2401.01303v1 |
null |
2024-01-02 |
$f$-Divergence Based Classification: Beyond the Use of Cross-Entropy |
Nicola Novello et.al. |
2401.01268v1 |
link |
2024-01-02 |
VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM |
Fuchen Long et.al. |
2401.01256v1 |
null |
2024-01-02 |
An operational approach to classifying measurement incompatibility |
Arun Kumar Das et.al. |
2401.01236v1 |
null |
2024-01-03 |
Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond |
Dimitrios Kollias et.al. |
2401.01219v2 |
null |
2024-01-02 |
FGENet: Fine-Grained Extraction Network for Congested Crowd Counting |
Hao-Yuan Ma et.al. |
2401.01208v1 |
null |
2024-01-02 |
Whole-examination AI estimation of fetal biometrics from 20-week ultrasound scans |
Lorenzo Venturini et.al. |
2401.01201v1 |
null |
2023-12-29 |
Computational Tools for Trees in Gauge Theory and Gravity |
Jacob L. Bourjaily et.al. |
2312.17745v1 |
null |
2023-12-29 |
Multiscale Vision Transformers meet Bipartite Matching for efficient single-stage Action Localization |
Ioanna Ntinou et.al. |
2312.17686v1 |
null |
2023-12-29 |
Malware Detection in IOT Systems Using Machine Learning Techniques |
Ali Mehrban et.al. |
2312.17683v1 |
null |
2023-12-29 |
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis |
Feng Liang et.al. |
2312.17681v1 |
null |
2023-12-29 |
Grasping, Part Identification, and Pose Refinement in One Shot with a Tactile Gripper |
Joyce Xin-Yan Lim et.al. |
2312.17650v1 |
null |
2023-12-29 |
MoD2T: Model-Data-Driven Motion-Static Object Tracking Method |
Yang Feng et.al. |
2312.17641v1 |
null |
2023-12-29 |
A New Explanation of the Mechanism of Hadley Circulation |
Wei Huang et.al. |
2312.17637v1 |
null |
2023-12-29 |
Towards Faithful Explanations for Text Classification with Robustness Improvement and Explanation Guided Training |
Dongfang Li et.al. |
2312.17591v1 |
null |
2023-12-29 |
A Tool for the Procedural Generation of Shaders using Interactive Evolutionary Algorithms |
Elio Sasso et.al. |
2312.17587v1 |
link |
2023-12-29 |
Distribution-based Low-rank Embedding |
Bardia Yousefi et.al. |
2312.17579v1 |
null |
2023-12-28 |
A Simple LLM Framework for Long-Range Video Question-Answering |
Ce Zhang et.al. |
2312.17235v1 |
null |
2023-12-28 |
4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency |
Yuyang Yin et.al. |
2312.17225v1 |
null |
2023-12-28 |
EFHQ: Multi-purpose ExtremePose-Face-HQ dataset |
Trung Tuan Dao et.al. |
2312.17205v1 |
null |
2023-12-28 |
One Model to Rule them All: Towards Universal Segmentation for Medical Images with Text Prompts |
Ziheng Zhao et.al. |
2312.17183v1 |
null |
2023-12-28 |
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action |
Jiasen Lu et.al. |
2312.17172v1 |
null |
2023-12-28 |
Classification of multiplication modules over multiplication rings with finitely many minimal primes |
Volodymyr Bavula et.al. |
2312.17170v1 |
null |
2023-12-28 |
Securing NextG Systems against Poisoning Attacks on Federated Learning: A Game-Theoretic Solution |
Yalin E. Sagduyu et.al. |
2312.17164v1 |
null |
2023-12-28 |
Replica Tree-based Federated Learning using Limited Data |
Ramona Ghilea et.al. |
2312.17159v1 |
null |
2023-12-29 |
ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe |
Yifan Bai et.al. |
2312.17133v2 |
null |
2023-12-28 |
Grounding-Prompter: Prompting LLM with Multimodal Information for Temporal Sentence Grounding in Long Videos |
Houlun Chen et.al. |
2312.17117v1 |
null |
2023-12-26 |
Microwave signal processing using an analog quantum reservoir computer |
Alen Senanian et.al. |
2312.16166v1 |
null |
2023-12-26 |
Large-scale Long-tailed Disease Diagnosis on Radiology Images |
Qiaoyu Zheng et.al. |
2312.16151v1 |
null |
2023-12-27 |
The Media Bias Taxonomy: A Systematic Literature Review on the Forms and Automated Detection of Media Bias |
Timo Spinde et.al. |
2312.16148v2 |
link |
2023-12-26 |
The non-Abelian Aharonov-Bohm effect |
P. A. Horvathy et.al. |
2312.16133v1 |
null |
2023-12-26 |
LangSplat: 3D Language Gaussian Splatting |
Minghan Qin et.al. |
2312.16084v1 |
null |
2023-12-26 |
AdaNAS: Adaptively Post-processing with Self-supervised Neural Architecture Search for Ensemble Rainfall Forecasts |
Yingpeng Wen et.al. |
2312.16046v1 |
null |
2023-12-26 |
An extended asymmetric sigmoid with Perceptron (SIGTRON) for imbalanced linear classification |
Hyenkyun Woo et.al. |
2312.16043v1 |
null |
2023-12-26 |
Multi-scale Progressive Feature Embedding for Accurate NIR-to-RGB Spectral Domain Translation |
Xingxing Yang et.al. |
2312.16040v1 |
null |
2023-12-26 |
Plug-and-Play Regularization on Magnitude with Deep Priors for 3D Near-Field MIMO Imaging |
Okyanus Oral et.al. |
2312.16024v1 |
null |
2023-12-26 |
Classification of positive solutions of Hardy-Sobolev equation without the finite volume constraints |
Lu Chen et.al. |
2312.16017v1 |
null |
2023-12-25 |
Training Convolutional Neural Networks with the Forward-Forward algorithm |
Riccardo Scodellaro et.al. |
2312.14924v2 |
null |
2023-12-22 |
DRStageNet: Deep Learning for Diabetic Retinopathy Staging from Fundus Images |
Yevgeniy Men et.al. |
2312.14891v1 |
null |
2023-12-22 |
On rate-optimal classification from non-private and from private data |
Balázs Csanád Csáji et.al. |
2312.14889v1 |
null |
2023-12-22 |
Classification of cubic tricirculant nut graphs |
Ivan Damnjanović et.al. |
2312.14884v1 |
null |
2023-12-22 |
Neural-network-based regularization methods for inverse problems in imaging |
Andreas Habring et.al. |
2312.14849v1 |
null |
2023-12-22 |
Classification of 3-GNDB Graphs |
Amir Hosseini et.al. |
2312.14835v1 |
null |
2023-12-22 |
Dreaming of Electrical Waves: Generative Modeling of Cardiac Excitation Waves using Diffusion Models |
Tanish Baranwal et.al. |
2312.14830v1 |
null |
2023-12-22 |
Classification of generalised higher-order Einstein-Maxwell Lagrangians |
Aimeric Colléaux et.al. |
2312.14814v1 |
null |
2023-12-22 |
On support vector machines under a multiple-cost scenario |
Sandra Benítez-Peña et.al. |
2312.14795v1 |
null |
2023-12-22 |
The Rate-Distortion-Perception-Classification Tradeoff: Joint Source Coding and Modulation via Inverse-Domain GANs |
Junli Fang et.al. |
2312.14792v1 |
null |
2023-12-21 |
3D Pose Estimation of Two Interacting Hands from a Monocular Event Camera |
Christen Millerdurai et.al. |
2312.14157v1 |
null |
2023-12-21 |
Virtual Pets: Animatable Animal Generation in 3D Scenes |
Yen-Chi Cheng et.al. |
2312.14154v1 |
null |
2023-12-21 |
TagAlign: Improving Vision-Language Alignment with Multi-Tag Classification |
Qinying Liu et.al. |
2312.14149v1 |
link |
2023-12-21 |
HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs |
Artem Sevastopolsky et.al. |
2312.14140v1 |
null |
2023-12-21 |
Revisiting Foreground and Background Separation in Weakly-supervised Temporal Action Localization: A Clustering-based Approach |
Qinying Liu et.al. |
2312.14138v1 |
link |
2023-12-21 |
Diffusion Reward: Learning Rewards via Conditional Video Diffusion |
Tao Huang et.al. |
2312.14134v1 |
null |
2023-12-21 |
WellFactor: Patient Profiling using Integrative Embedding of Healthcare Data |
Dongjin Choi et.al. |
2312.14129v1 |
null |
2023-12-21 |
VideoPoet: A Large Language Model for Zero-Shot Video Generation |
Dan Kondratyuk et.al. |
2312.14125v1 |
null |
2023-12-21 |
LingoQA: Video Question Answering for Autonomous Driving |
Ana-Maria Marcu et.al. |
2312.14115v1 |
link |
2023-12-21 |
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding |
Senqiao Yang et.al. |
2312.14074v1 |
null |
2023-12-20 |
Deep Learning on 3D Neural Fields |
Pierluigi Zama Ramirez et.al. |
2312.13277v1 |
null |
2023-12-20 |
The 1/4-BPS building blocks of brane interactions |
Ben Eckardt et.al. |
2312.13269v1 |
null |
2023-12-20 |
ClassLIE: Structure- and Illumination-Adaptive Classification for Low-Light Image Enhancement |
Zixiang Wei et.al. |
2312.13265v1 |
null |
2023-12-20 |
Putting the p back in Prym |
Jeff Achter et.al. |
2312.13263v1 |
null |
2023-12-20 |
The role of data embedding in equivariant quantum convolutional neural networks |
Sreetama Das et.al. |
2312.13250v1 |
null |
2023-12-20 |
Enhancing Neural Training via a Correlated Dynamics Model |
Jonathan Brokman et.al. |
2312.13247v1 |
null |
2023-12-20 |
SISMIK for brain MRI: Deep-learning-based motion estimation and model-based motion correction in k-space |
Oscar Dabrowski et.al. |
2312.13220v1 |
null |
2023-12-20 |
Boost recall in QSO selection from highly imbalanced photometric datasets |
Giorgio Calderone et.al. |
2312.13194v1 |
null |
2023-12-20 |
Ergodic measures for periodic type $\mathbb{Z}^m$-skew-products over Interval Exchange Transformations |
Yuriy Tumarkin et.al. |
2312.13165v1 |
null |
2023-12-20 |
Underwater Acoustic Signal Recognition Based on Salient Features |
Minghao Chen et.al. |
2312.13143v1 |
null |
2023-12-19 |
Tracking Any Object Amodally |
Cheng-Yen Hsieh et.al. |
2312.12433v1 |
null |
2023-12-19 |
The Endoscapes Dataset for Surgical Scene Segmentation, Object Detection, and Critical View of Safety Assessment: Official Splits and Benchmark |
Aditya Murali et.al. |
2312.12429v1 |
null |
2023-12-19 |
Chasing Fairness in Graphs: A GNN Architecture Perspective |
Zhimeng Jiang et.al. |
2312.12369v1 |
link |
2023-12-19 |
Easy quantum groups |
Teo Banica et.al. |
2312.12368v1 |
null |
2023-12-19 |
SMC-NCA: Semantic-guided Multi-level Contrast for Semi-supervised Action Segmentation |
Feixiang Zhou et.al. |
2312.12347v1 |
null |
2023-12-19 |
On the Effectiveness of Retrieval, Alignment, and Replay in Manipulation |
Norman Di Palo et.al. |
2312.12345v1 |
null |
2023-12-19 |
Full-reference Video Quality Assessment for User Generated Content Transcoding |
Zihao Qi et.al. |
2312.12317v1 |
null |
2023-12-19 |
First qualitative observations on deep learning vision model YOLO and DETR for automated driving in Austria |
Stefan Schoder et.al. |
2312.12314v1 |
null |
2023-12-19 |
Holography of New Conformal Higher Spin Gravities in 3d |
I. Lovrekovic et.al. |
2312.12301v1 |
null |
2023-12-19 |
Prompt-based Domain Discrimination for Multi-source Time Series Domain Adaptation |
Junxiang Wang et.al. |
2312.12276v1 |
null |
2023-12-18 |
Development and Evaluation of Ensemble Learning-based Environmental Methane Detection and Intensity Prediction Models |
Reek Majumder et.al. |
2312.10879v1 |
null |
2023-12-18 |
Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation |
Hui Fu et.al. |
2312.10877v1 |
null |
2023-12-17 |
Global relaxation-based LP-Newton method for multiple hyperparameter selection in support vector classification with feature selection |
Qingna Li et.al. |
2312.10848v1 |
null |
2023-12-17 |
Online Boosting Adaptive Learning under Concept Drift for Multistream Classification |
En Yu et.al. |
2312.10841v1 |
null |
2023-12-17 |
Learning to Act without Actions |
Dominik Schmidt et.al. |
2312.10812v1 |
null |
2023-12-17 |
Land use/land cover classification of fused Sentinel-1 and Sentinel-2 imageries using ensembles of Random Forests |
Shivam Pande et.al. |
2312.10798v1 |
null |
2023-12-17 |
Learning to Learn in Interactive Constraint Acquisition |
Dimos Tsouros et.al. |
2312.10795v1 |
null |
2023-12-17 |
Identification of Knowledge Neurons in Protein Language Models |
Divya Nori et.al. |
2312.10770v1 |
null |
2023-12-17 |
Multi-Label Classification of COVID-Tweets Using Large Language Models |
Aniket Deroy et.al. |
2312.10748v1 |
link |
2023-12-17 |
Unmasking Deepfake Faces from Videos Using An Explainable Cost-Sensitive Deep Learning Approach |
Faysal Mahmud et.al. |
2312.10740v1 |
link |
2023-12-15 |
Understanding Probe Behaviors through Variational Bounds of Mutual Information |
Kwanghee Choi et.al. |
2312.10019v1 |
link |
2023-12-15 |
Wearable Coaxially-shielded Metamaterial for Magnetic Resonance Imaging |
Xia Zhu et.al. |
2312.10018v1 |
null |
2023-12-15 |
On the Invertibility of Euler Integral Transforms with Hyperplanes and Quadric Hypersurfaces |
Mattie Ji et.al. |
2312.10002v1 |
null |
2023-12-15 |
Towards Architecture-Insensitive Untrained Network Priors for Accelerated MRI Reconstruction |
Yilin Liu et.al. |
2312.09988v1 |
null |
2023-12-15 |
DHFormer: A Vision Transformer-Based Attention Module for Image Dehazing |
Abdul Wasi et.al. |
2312.09955v1 |
null |
2023-12-15 |
Multi-level graph learning for audio event classification and human-perceived annoyance rating prediction |
Yuanbo Hou et.al. |
2312.09952v1 |
null |
2023-12-15 |
LogoStyleFool: Vitiating Video Recognition Systems via Logo Style Transfer |
Yuxin Cao et.al. |
2312.09935v1 |
link |
2023-12-15 |
RDR: the Recap, Deliberate, and Respond Method for Enhanced Language Understanding |
Yuxin Zi et.al. |
2312.09932v1 |
null |
2023-12-15 |
Reliable Probabilistic Classification with Neural Networks |
Harris Papadopoulos et.al. |
2312.09912v1 |
null |
2023-12-15 |
TMP: Temporal Motion Propagation for Online Video Super-Resolution |
Zhengqiang Zhang et.al. |
2312.09909v1 |
null |
2023-12-14 |
3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting |
Zhiyin Qian et.al. |
2312.09228v1 |
null |
2023-12-14 |
Efficient Online Learning of Contact Force Models for Connector Insertion |
Kevin Tracy et.al. |
2312.09190v1 |
null |
2023-12-14 |
General Object Foundation Model for Images and Videos at Scale |
Junfeng Wu et.al. |
2312.09158v1 |
null |
2023-12-14 |
Evaluating Augmented Reality Communication: How Can We Teach Procedural Skill in AR? |
Manuel Rebol et.al. |
2312.09152v1 |
null |
2023-12-14 |
Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting |
Anthony Chen et.al. |
2312.09148v1 |
null |
2023-12-14 |
Class-Wise Buffer Management for Incremental Object Detection: An Effective Buffer Training Strategy |
Junsu Kim et.al. |
2312.09139v1 |
null |
2023-12-14 |
Less is more -- the Dispatcher/Executor principle for multi-task Reinforcement Learning |
Martin Riedmiller et.al. |
2312.09120v1 |
null |
2023-12-14 |
VideoLCM: Video Latent Consistency Model |
Xiang Wang et.al. |
2312.09109v1 |
null |
2023-12-14 |
FastInject: Injecting Unpaired Text Data into CTC-based ASR training |
Keqi Deng et.al. |
2312.09100v1 |
null |
2023-12-14 |
Agent Attention: On the Integration of Softmax and Linear Attention |
Dongchen Han et.al. |
2312.08874v1 |
link |
2023-12-13 |
VLAP: Efficient Video-Language Alignment via Frame Prompting and Distilling for Video Question Answering |
Xijun Wang et.al. |
2312.08367v1 |
null |
2023-12-13 |
Challenges and Opportunities in Implementing Negative Differential Resistance Mode Reconfigurable Field Effect Transistors |
Lephe S et.al. |
2312.08351v1 |
null |
2023-12-13 |
Enhancing CT Image synthesis from multi-modal MRI data based on a multi-task neural network framework |
Zhuoyao Xin et.al. |
2312.08343v1 |
null |
2023-12-13 |
Preparing VVC for Streaming: A Fast Multi-Rate Encoding Approach |
Yiqun Liu et.al. |
2312.08330v1 |
null |
2023-12-13 |
Affine monoids of corank one |
Yulia Zaitseva et.al. |
2312.08316v1 |
null |
2023-12-13 |
VQ-HPS: Human Pose and Shape Estimation in a Vector-Quantized Latent Space |
Guénolé Fiche et.al. |
2312.08291v1 |
null |
2023-12-13 |
PhenDiff: Revealing Invisible Phenotypes with Conditional Diffusion Models |
Anis Bourou et.al. |
2312.08290v1 |
link |
2023-12-13 |
On the verification of Embeddings using Hybrid Markov Logic |
Anup Shakya et.al. |
2312.08287v1 |
null |
2023-12-14 |
High-throughput Biomedical Relation Extraction for Semi-Structured Web Articles Empowered by Large Language Models |
Songchi Zhou et.al. |
2312.08274v2 |
null |
2023-12-13 |
Efficient Multi-Object Pose Estimation using Multi-Resolution Deformable Attention and Query Aggregation |
Arul Selvam Periyasamy et.al. |
2312.08268v1 |
null |
2023-12-12 |
diff History for Long-Context Language Agents |
Ulyana Piterbarg et.al. |
2312.07540v1 |
null |
2023-12-12 |
FreeInit: Bridging Initialization Gap in Video Diffusion Models |
Tianxing Wu et.al. |
2312.07537v1 |
link |
2023-12-12 |
WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion |
Soyong Shin et.al. |
2312.07531v1 |
null |
2023-12-12 |
RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation |
Peng Lu et.al. |
2312.07526v1 |
link |
2023-12-12 |
PEEKABOO: Interactive Video Generation via Masked-Diffusion |
Yash Jain et.al. |
2312.07509v1 |
null |
2023-12-12 |
NAC-TCN: Temporal Convolutional Networks with Causal Dilated Neighborhood Attention for Emotion Understanding |
Alexander Mehta et.al. |
2312.07507v1 |
link |
2023-12-12 |
COLMAP-Free 3D Gaussian Splatting |
Yang Fu et.al. |
2312.07504v1 |
null |
2023-12-12 |
NearbyPatchCL: Leveraging Nearby Patches for Self-Supervised Patch-Level Multi-Class Classification in Whole-Slide Images |
Gia-Bao Le et.al. |
2312.07489v1 |
null |
2023-12-12 |
MinD-3D: Reconstruct High-quality 3D objects in Human Brain |
Jianxiong Gao et.al. |
2312.07485v1 |
null |
2023-12-12 |
Classification of retail products: From probabilistic ranking to neural networks |
Manar Mohamed Hafez et.al. |
2312.07482v1 |
null |
2023-12-11 |
Photorealistic Video Generation with Diffusion Models |
Agrim Gupta et.al. |
2312.06662v1 |
null |
2023-12-11 |
LightSim: Neural Lighting Simulation for Urban Scenes |
Ava Pun et.al. |
2312.06654v1 |
null |
2023-12-11 |
Beyond Classification: Definition and Density-based Estimation of Calibration in Object Detection |
Teodora Popordanoska et.al. |
2312.06645v1 |
null |
2023-12-11 |
Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution |
Shangchen Zhou et.al. |
2312.06640v1 |
null |
2023-12-12 |
TMT-VIS: Taxonomy-aware Multi-dataset Joint Training for Video Instance Segmentation |
Rongkun Zheng et.al. |
2312.06630v2 |
link |
2023-12-11 |
Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism |
Georgios Milis et.al. |
2312.06613v1 |
link |
2023-12-11 |
Early Action Recognition with Action Prototypes |
Guglielmo Camporese et.al. |
2312.06598v1 |
null |
2023-12-11 |
Flexible visual prompts for in-context learning in computer vision |
Thomas Foster et.al. |
2312.06592v1 |
link |
2023-12-11 |
QuickQuakeBuildings: Post-earthquake SAR-Optical Dataset for Quick Damaged-building Detection |
Yao Sun et.al. |
2312.06587v1 |
null |
2023-12-12 |
ESO/HARPS Radial Velocities Catalog |
Mauro Barbieri et.al. |
2312.06586v2 |
null |
2023-12-08 |
The Long Secondary Period (LSP) Variables: Overview and Some Analysis |
John R. Percy et.al. |
2312.05255v1 |
null |
2023-12-08 |
Few-Shot Class-Incremental Learning via Training-Free Prototype Calibration |
Qi-Wei Wang et.al. |
2312.05229v1 |
null |
2023-12-08 |
Shape Matters: Detecting Vertebral Fractures Using Differentiable Point-Based Shape Decoding |
Hellena Hempe et.al. |
2312.05220v1 |
link |
2023-12-08 |
Enhancing Facial Classification and Recognition using 3D Facial Models and Deep Learning |
Houting Li et.al. |
2312.05219v1 |
null |
2023-12-08 |
IntrinsicAvatar: Physically Based Inverse Rendering of Dynamic Humans from Monocular Videos via Explicit Ray Tracing |
Shaofei Wang et.al. |
2312.05210v1 |
null |
2023-12-08 |
Embedding theory in ML toward real-time tracking of structural dynamics through hyperspectral datasets |
Jonathan D Hollenbach et.al. |
2312.05201v1 |
null |
2023-12-08 |
Video-Based Rendering Techniques: A Survey |
Rafael Kuffner dos Anjos et.al. |
2312.05179v1 |
null |
2023-12-08 |
Enhancing Single-Frame Supervision for Better Temporal Action Localization |
Changjian Chen et.al. |
2312.05178v1 |
null |
2023-12-08 |
MRI Scan Synthesis Methods based on Clustering and Pix2Pix |
Giulia Baldini et.al. |
2312.05176v1 |
null |
2023-12-08 |
TriHuman: A Real-time and Controllable Tri-plane Representation for Detailed Human Geometry and Appearance Synthesis |
Heming Zhu et.al. |
2312.05161v1 |
null |
2023-12-07 |
GenDeF: Learning Generative Deformation Field for Video Generation |
Wen Wang et.al. |
2312.04561v1 |
null |
2023-12-07 |
MonoGaussianAvatar: Monocular Gaussian Point-based Head Avatar |
Yufan Chen et.al. |
2312.04558v1 |
null |
2023-12-07 |
GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation |
Shoufa Chen et.al. |
2312.04557v1 |
null |
2023-12-07 |
SPIDeRS: Structured Polarization for Invisible Depth and Reflectance Sensing |
Tomoki Ichikawa et.al. |
2312.04553v1 |
null |
2023-12-07 |
PlayFusion: Skill Acquisition via Diffusion from Language-Annotated Play |
Lili Chen et.al. |
2312.04549v1 |
null |
2023-12-07 |
Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception? |
Aritra Dutta et.al. |
2312.04548v1 |
null |
2023-12-07 |
Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models |
Ivan Kapelyukh et.al. |
2312.04533v1 |
null |
2023-12-07 |
Camera Height Doesn't Change: Unsupervised Monocular Scale-Aware Road-Scene Depth Estimation |
Genki Kinoshita et.al. |
2312.04530v1 |
null |
2023-12-07 |
RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models |
Ozgur Kara et.al. |
2312.04524v1 |
link |
2023-12-07 |
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation |
Zhiwu Qing et.al. |
2312.04483v1 |
null |
2023-12-06 |
OneLLM: One Framework to Align All Modalities with Language |
Jiaming Han et.al. |
2312.03700v1 |
link |
2023-12-07 |
Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers |
Umberto Cappellazzo et.al. |
2312.03694v2 |
null |
2023-12-06 |
Direct Exoplanet Detection Using Deep Convolutional Image Reconstruction (ConStruct): A New Algorithm for Post-Processing High-Contrast Images |
Trevor N. Wolf et.al. |
2312.03671v1 |
null |
2023-12-06 |
Annihilating branching Brownian motion |
Daniel Ahlberg et.al. |
2312.03669v1 |
null |
2023-12-06 |
Towards small and accurate convolutional neural networks for acoustic biodiversity monitoring |
Serge Zaugg et.al. |
2312.03666v1 |
null |
2023-12-06 |
Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving |
Ming Nie et.al. |
2312.03661v1 |
link |
2023-12-06 |
Editable Stain Transformation Of Histological Images Using Unpaired GANs |
Tibor Sloboda et.al. |
2312.03647v1 |
link |
2023-12-06 |
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation |
Zhouxia Wang et.al. |
2312.03641v1 |
null |
2023-12-06 |
Training Neural Networks on RAW and HDR Images for Restoration Tasks |
Lei Luo et.al. |
2312.03640v1 |
link |
2023-12-07 |
Evaluation of Active Feature Acquisition Methods for Static Feature Settings |
Henrik von Kleist et.al. |
2312.03619v2 |
null |
2023-12-05 |
Dexterous Functional Grasping |
Ananye Agarwal et.al. |
2312.02975v1 |
null |
2023-12-05 |
Describing Differences in Image Sets with Natural Language |
Lisa Dunlap et.al. |
2312.02974v1 |
link |
2023-12-05 |
GauHuman: Articulated Gaussian Splatting from Monocular Human Videos |
Shoukang Hu et.al. |
2312.02973v1 |
link |
2023-12-05 |
Detecting algorithmic bias in medical AI-models |
Jeffrey Smith et.al. |
2312.02959v1 |
null |
2023-12-05 |
Classification for everyone : Building geography agnostic models for fairer recognition |
Akshat Jindal et.al. |
2312.02957v1 |
null |
2023-12-05 |
Choroidalyzer: An open-source, end-to-end pipeline for choroidal analysis in optical coherence tomography |
Justin Engelmann et.al. |
2312.02956v1 |
null |
2023-12-05 |
An alternating peak-optimization method for optimal trajectory generation of quadrotor drones |
Wytze A. B. de Vries et.al. |
2312.02944v1 |
null |
2023-12-05 |
Fast CT anatomic localization algorithm |
Amit Oved et.al. |
2312.02941v1 |
null |
2023-12-05 |
Drag-A-Video: Non-rigid Video Editing with Point-based Interaction |
Yao Teng et.al. |
2312.02936v1 |
null |
2023-12-06 |
WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation |
Jiachen Lu et.al. |
2312.02934v2 |
link |
2023-12-04 |
iMatching: Imperative Correspondence Learning |
Zitong Zhan et.al. |
2312.02141v1 |
null |
2023-12-04 |
Fast View Synthesis of Casual Videos |
Yao-Chih Lee et.al. |
2312.02135v1 |
null |
2023-12-04 |
GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians |
Liangxiao Hu et.al. |
2312.02134v1 |
null |
2023-12-04 |
Hot PATE: Private Aggregation of Distributions for Diverse Task |
Edith Cohen et.al. |
2312.02132v1 |
null |
2023-12-04 |
Can we truly transfer an actor's genuine happiness to avatars? An investigation into virtual, real, posed and spontaneous faces |
Vitor Miguel Xavier Peres et.al. |
2312.02128v1 |
null |
2023-12-04 |
Cosmic star-formation history and black hole accretion history inferred from the JWST mid-infrared source counts |
Seong Jin Kim et.al. |
2312.02090v1 |
null |
2023-12-05 |
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence |
Yuchao Gu et.al. |
2312.02087v2 |
null |
2023-12-04 |
Integrating AI into CCTV Systems: A Comprehensive Evaluation of Smart Video Surveillance in Community Space |
Shanle Yao et.al. |
2312.02078v1 |
null |
2023-12-04 |
GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians |
Shenhan Qian et.al. |
2312.02069v1 |
null |
2023-12-04 |
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding |
Shuhuai Ren et.al. |
2312.02051v1 |
null |
2023-12-01 |
Dense Optical Tracking: Connecting the Dots |
Guillaume Le Moing et.al. |
2312.00786v1 |
null |
2023-12-01 |
Sequential Modeling Enables Scalable Learning for Large Vision Models |
Yutong Bai et.al. |
2312.00785v1 |
null |
2023-12-01 |
MorpheuS: Neural Dynamic 360° Surface Reconstruction from Monocular RGB-D Video |
Hengyi Wang et.al. |
2312.00778v1 |
null |
2023-12-01 |
VideoBooth: Diffusion-based Video Generation with Image Prompts |
Yuming Jiang et.al. |
2312.00777v1 |
null |
2023-12-01 |
Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans |
Homanga Bharadhwaj et.al. |
2312.00775v1 |
null |
2023-12-01 |
Explaining Knock-on Effects of Bias Mitigation |
Svetoslav Nizhnichenkov et.al. |
2312.00765v1 |
null |
2023-12-04 |
Deep Unlearning: Fast and Efficient Training-free Approach to Controlled Forgetting |
Sangamesh Kodge et.al. |
2312.00761v2 |
null |
2023-12-01 |
Mitigating Over-smoothing in Transformers via Regularized Nonlocal Functionals |
Tam Nguyen et.al. |
2312.00751v1 |
null |
2023-12-01 |
Tight-minimal dichotomies in Banach spaces |
Alejandra C. Cáceres-Rigo et.al. |
2312.00721v1 |
null |
2023-12-01 |
GIFT: Generative Interpretable Fine-Tuning Transformers |
Chinmay Savadikar et.al. |
2312.00700v1 |
link |
2023-11-30 |
Just Add $π$! Pose Induced Video Transformers for Understanding Activities of Daily Living |
Dominick Reilly et.al. |
2311.18840v1 |
null |
2023-11-30 |
TrafficMOT: A Challenging Dataset for Multi-Object Tracking in Complex Traffic Scenarios |
Lihao Liu et.al. |
2311.18839v1 |
null |
2023-11-30 |
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models |
Zhen Xing et.al. |
2311.18837v1 |
null |
2023-11-30 |
ART$\boldsymbol{\cdot}$V: Auto-Regressive Text-to-Video Generation with Diffusion Models |
Wenming Weng et.al. |
2311.18834v1 |
null |
2023-11-30 |
MotionEditor: Editing Video Motion via Content-Aware Diffusion |
Shuyuan Tu et.al. |
2311.18830v1 |
link |
2023-11-30 |
MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation |
Yanhui Wang et.al. |
2311.18829v1 |
null |
2023-11-30 |
Motion-Conditioned Image Animation for Video Editing |
Wilson Yan et.al. |
2311.18827v1 |
null |
2023-11-30 |
CAST: Cross-Attention in Space and Time for Video Action Recognition |
Dongho Lee et.al. |
2311.18825v1 |
link |
2023-11-30 |
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking |
Kaifeng Lyu et.al. |
2311.18817v1 |
link |
2023-11-30 |
BIOCLIP: A Vision Foundation Model for the Tree of Life |
Samuel Stevens et.al. |
2311.18803v1 |
null |
2023-11-30 |
Do text-free diffusion models learn discriminative visual representations? |
Soumik Mukhopadhyay et.al. |
2311.17921v2 |
null |
2023-11-29 |
Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving |
Yuqi Wang et.al. |
2311.17918v1 |
link |
2023-11-29 |
HUGS: Human Gaussian Splats |
Muhammed Kocabas et.al. |
2311.17910v1 |
null |
2023-11-29 |
SODA: Bottleneck Diffusion Models for Representation Learning |
Drew A. Hudson et.al. |
2311.17901v1 |
null |
2023-11-30 |
Knowledge Pursuit Prompting for Zero-Shot Multimodal Synthesis |
Jinqi Luo et.al. |
2311.17898v2 |
null |
2023-11-29 |
On the geometry of tensor products over finite fields |
Stefano Lia et.al. |
2311.17896v1 |
null |
2023-11-29 |
Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation |
Shuangrui Ding et.al. |
2311.17893v1 |
null |
2023-11-29 |
TSDF-Sampling: Efficient Sampling for Neural Surface Field using Truncated Signed Distance Field |
Chaerin Min et.al. |
2311.17878v1 |
null |
2023-11-29 |
Enhancing Post-Hoc Explanation Benchmark Reliability for Image Classification |
Tristan Gomez et.al. |
2311.17876v1 |
null |
2023-11-29 |
On the Adversarial Robustness of Graph Contrastive Learning Methods |
Filippo Guerranti et.al. |
2311.17853v1 |
null |
2023-11-28 |
Panoptic Video Scene Graph Generation |
Jingkang Yang et.al. |
2311.17058v1 |
link |
2023-11-28 |
Self-Supervised Motion Magnification by Backpropagating Through Optical Flow |
Zhaoying Pan et.al. |
2311.17056v1 |
null |
2023-11-28 |
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training |
Pavan Kumar Anasosalu Vasu et.al. |
2311.17049v1 |
null |
2023-11-28 |
Jets of foliations and $b^k$-algebroids |
Francis Bischoff et.al. |
2311.17045v1 |
null |
2023-11-28 |
LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models |
Yanwei Li et.al. |
2311.17043v1 |
link |
2023-11-29 |
Efficient In-Context Learning in Vision-Language Models for Egocentric Videos |
Keunwoo Peter Yu et.al. |
2311.17041v2 |
null |
2023-11-28 |
Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer |
Danah Yatim et.al. |
2311.17009v1 |
null |
2023-11-28 |
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark |
Kunchang Li et.al. |
2311.17005v1 |
link |
2023-11-28 |
Mirković-Vilonen Polytopes from Combinatorics |
Mario Sanchez et.al. |
2311.16979v1 |
null |
2023-11-28 |
Natural Language Processing Through Transfer Learning: A Case Study on Sentiment Analysis |
Aman Yadav et.al. |
2311.16965v1 |
null |
2023-11-28 |
Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models |
Munan Ning et.al. |
2311.16103v2 |
link |
2023-11-27 |
GART: Gaussian Articulated Template Models |
Jiahui Lei et.al. |
2311.16099v1 |
null |
2023-11-27 |
On Bringing Robots Home |
Nur Muhammad Mahi Shafiullah et.al. |
2311.16098v1 |
link |
2023-11-27 |
CG-HOI: Contact-Guided 3D Human-Object Interaction Generation |
Christian Diller et.al. |
2311.16097v1 |
null |
2023-11-27 |
Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling |
Zhe Li et.al. |
2311.16096v1 |
link |
2023-11-27 |
Three-dimensional $\mathbb{Z}$ topological insulators without reflection symmetry |
Alexander C. Tyner et.al. |
2311.16092v1 |
null |
2023-11-27 |
BERT Goes Off-Topic: Investigating the Domain Transfer Challenge using Genre Classification |
Dmitri Roussinov et.al. |
2311.16083v1 |
link |
2023-11-27 |
ViT-Lens-2: Gateway to Omni-modal Intelligence |
Weixian Lei et.al. |
2311.16081v1 |
link |
2023-11-27 |
Correlated Spectral and Recurrence Variations of Cygnus X-1 |
E. M. Broadbent et.al. |
2311.16070v1 |
null |
2023-11-27 |
DiffSLVA: Harnessing Diffusion Models for Sign Language Video Anonymization |
Zhaoyang Xia et.al. |
2311.16060v1 |
link |
2023-11-24 |
SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation |
Lingchen Meng et.al. |
2311.14671v1 |
link |
2023-11-24 |
JetLOV: Enhancing Jet Tree Tagging through Neural Network Learning of Optimal LundNet Variables |
Mauricio A. Diaz et.al. |
2311.14654v1 |
link |
2023-11-24 |
Learning in Deep Factor Graphs with Gaussian Belief Propagation |
Seth Nabarro et.al. |
2311.14649v1 |
null |
2023-11-24 |
Continuous football player tracking from discrete broadcast data |
Matthew J. Penn et.al. |
2311.14642v1 |
null |
2023-11-24 |
Emergent Topology in Many-Body Dissipative Quantum Chaos |
Antonio M. García-García et.al. |
2311.14640v1 |
null |
2023-11-24 |
Unsupervised high-throughput segmentation of cells and cell nuclei in quantitative phase images |
Julia Sistermanns et.al. |
2311.14639v1 |
null |
2023-11-24 |
ARIA: On the interaction between Architectures, Aggregation methods and Initializations in federated visual classification |
Vasilis Siomos et.al. |
2311.14625v1 |
null |
2023-11-24 |
Neural Style Transfer for Computer Games |
Eleftherios Ioannou et.al. |
2311.14617v1 |
null |
2023-11-24 |
Animate124: Animating One Image to 4D Dynamic Scene |
Yuyang Zhao et.al. |
2311.14603v1 |
null |
2023-11-24 |
A Metalearned Neural Circuit for Nonparametric Bayesian Inference |
Jake C. Snell et.al. |
2311.14601v1 |
link |
2023-11-22 |
WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space |
Katja Schwarz et.al. |
2311.13570v1 |
null |
2023-11-22 |
Belted sum decompositions of fully augmented links |
Porter Morgan et.al. |
2311.13540v1 |
null |
2023-11-22 |
Learned Nonlinear Predictor for Critically Sampled 3D Point Cloud Attribute Compression |
Tam Thuc Do et.al. |
2311.13539v1 |
null |
2023-11-22 |
Leveraging CNNs and Ensemble Learning for Automated Disaster Image Classification |
Archit Rathod et.al. |
2311.13531v1 |
null |
2023-11-22 |
Applying Dimensionality Reduction as Precursor to LSTM-CNN Models for Classifying Imagery and Motor Signals in ECoG-Based BCIs |
Soham Bafana et.al. |
2311.13507v1 |
link |
2023-11-22 |
Current Topological and Machine Learning Applications for Bias Detection in Text |
Colleen Farrelly et.al. |
2311.13495v1 |
null |
2023-11-22 |
Benchmarking Toxic Molecule Classification using Graph Neural Networks and Few Shot Learning |
Bhavya Mehta et.al. |
2311.13490v1 |
null |
2023-11-22 |
Deep-learning-based acceleration of MRI for radiotherapy planning of pediatric patients with brain tumors |
Shahinur Alam et.al. |
2311.13485v1 |
link |
2023-11-22 |
Solution discovery via reconfiguration for problems in P |
Mario Grobler et.al. |
2311.13478v1 |
null |
2023-11-22 |
Experimentation in Early-Stage Video Game Startups: Practices and Challenges |
Henry Edison et.al. |
2311.13462v1 |
null |
2023-11-21 |
Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models |
David Stotko et.al. |
2311.12796v1 |
null |
2023-11-21 |
Quantifying Impairment and Disease Severity Using AI Models Trained on Healthy Subjects |
Boyang Yu et.al. |
2311.12781v1 |
link |
2023-11-21 |
Swift Parameter-free Attention Network for Efficient Super-Resolution |
Cheng Wan et.al. |
2311.12770v1 |
link |
2023-11-22 |
Investigating Weight-Perturbed Deep Neural Networks With Application in Iris Presentation Attack Detection |
Renu Sharma et.al. |
2311.12764v2 |
link |
2023-11-21 |
High-resolution Image-based Malware Classification using Multiple Instance Learning |
Tim Peters et.al. |
2311.12760v1 |
link |
2023-11-21 |
SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction |
Yuanhui Huang et.al. |
2311.12754v1 |
link |
2023-11-21 |
Image Transformation for IoT Time-Series Data: A Review |
Duygu Altunkaya et.al. |
2311.12742v1 |
null |
2023-11-21 |
Exploring Graph Classification Techniques Under Low Data Constraints: A Comprehensive Study |
Kush Kothari et.al. |
2311.12737v1 |
null |
2023-11-21 |
Not Just Training, Also Testing: High School Youths' Perspective-Taking through Peer Testing Machine Learning-Powered Applications |
L. Morales-Navarro et.al. |
2311.12733v1 |
null |
2023-11-21 |
Cascade Learning Localises Discriminant Features in Visual Scene Classification |
Junwen Wang et.al. |
2311.12704v1 |
null |
2023-11-20 |
Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation |
Wenhao Li et.al. |
2311.12028v1 |
null |
2023-11-20 |
GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration |
Naoki Wake et.al. |
2311.12015v1 |
null |
2023-11-20 |
Evaluating Supervision Levels Trade-Offs for Infrared-Based People Counting |
David Latortue et.al. |
2311.11974v1 |
null |
2023-11-20 |
SA-Med2D-20M Dataset: Segment Anything in 2D Medical Imaging with 20 Million masks |
Jin Ye et.al. |
2311.11969v1 |
link |
2023-11-20 |
Correlated Attention in Transformers for Multivariate Time Series |
Quang Minh Nguyen et.al. |
2311.11959v1 |
null |
2023-11-20 |
Tubular Curvature Filter: Implicit Pointwise Curvature Calculation Method for Tubular Objects |
Elifnur Sunger et.al. |
2311.11931v1 |
null |
2023-11-20 |
LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions |
Songhao Han et.al. |
2311.11904v1 |
null |
2023-11-20 |
Multimodal Characterization of Emotion within Multimedia Space |
Dayo Samuel Banjo et.al. |
2311.11892v1 |
null |
2023-11-20 |
SniffyArt: The Dataset of Smelling Persons |
Mathias Zinnen et.al. |
2311.11888v1 |
null |
2023-11-20 |
Multi-Task Faces (MTF) Data Set: A Legally and Ethically Compliant Collection of Face Images for Various Classification Tasks |
Rami Haffar et.al. |
2311.11882v1 |
link |
2023-11-17 |
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning |
Rohit Girdhar et.al. |
2311.10709v1 |
null |
2023-11-17 |
SpACNN-LDVAE: Spatial Attention Convolutional Latent Dirichlet Variational Autoencoder for Hyperspectral Pixel Unmixing |
Soham Chitnis et.al. |
2311.10701v1 |
null |
2023-11-17 |
A note on the convergence of the Bayesian entropy estimator for exchangeable partitions |
Servet Martinez et.al. |
2311.10698v1 |
null |
2023-11-17 |
Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections |
Lihan Zha et.al. |
2311.10678v1 |
link |
2023-11-17 |
3D-TexSeg: Unsupervised Segmentation of 3D Texture using Mutual Transformer Learning |
Iyyakutti Iyappan Ganapathi et.al. |
2311.10651v1 |
null |
2023-11-17 |
User Dynamics-Aware Edge Caching and Computing for Mobile Virtual Reality |
Mushu Li et.al. |
2311.10645v1 |
null |
2023-11-17 |
Image-Domain Material Decomposition for Dual-energy CT using Unsupervised Learning with Data-fidelity Loss |
Junbo Peng et.al. |
2311.10641v1 |
null |
2023-11-17 |
Scaling TabPFN: Sketching and Feature Selection for Tabular Prior-Data Fitted Networks |
Benjamin Feuer et.al. |
2311.10609v1 |
null |
2023-11-17 |
Designing Reconfigurable Intelligent Systems with Markov Blankets |
Boris Sedlak et.al. |
2311.10597v1 |
null |
2023-11-17 |
FOCAL: A Cost-Aware Video Dataset for Active Learning |
Kiran Kokilepersaud et.al. |
2311.10591v1 |
link |
2023-11-16 |
Traffic Video Object Detection using Motion Prior |
Lihao Liu et.al. |
2311.10092v1 |
null |
2023-11-16 |
Moduli space of rank three logarithmic connections on the projective line with three poles |
Takafumi Matsumoto et.al. |
2311.10071v1 |
null |
2023-11-16 |
Inherently Interpretable Time Series Classification via Multiple Instance Learning |
Joseph Early et.al. |
2311.10049v1 |
link |
2023-11-16 |
On the potential of Carbon-Enhanced Metal-Poor stars for Galactic Archaeology |
Aruna Goswami et.al. |
2311.10043v1 |
null |
2023-11-16 |
Match and Locate: low-frequency monocular odometry based on deep feature matching |
Stepan Konev et.al. |
2311.10034v1 |
null |
2023-11-16 |
Revolutionizing Customer Interactions: Insights and Challenges in Deploying ChatGPT and Generative Chatbots for FAQs |
Feriel Khennouche et.al. |
2311.09976v1 |
null |
2023-11-16 |
From Pretext to Purpose: Batch-Adaptive Self-Supervised Learning |
Jiansong Zhang et.al. |
2311.09974v1 |
null |
2023-11-16 |
VertDetect: Fully End-to-End 3D Vertebral Instance Segmentation Model |
Geoff Klein et.al. |
2311.09958v1 |
null |
2023-11-16 |
Harnessing Transformers: A Leap Forward in Lung Cancer Image Detection |
Amine Bechar et.al. |
2311.09942v1 |
null |
2023-11-17 |
A Framework for Monitoring and Retraining Language Models in Real-World Applications |
Jaykumar Kasundra et.al. |
2311.09930v2 |
null |
2023-11-15 |
Single-Image 3D Human Digitization with Shape-Guided Diffusion |
Badour AlBahar et.al. |
2311.09221v1 |
null |
2023-11-15 |
ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy |
Kirill Vishniakov et.al. |
2311.09215v1 |
link |
2023-11-15 |
Topology of Pulsar Profiles (ToPP). I. Graph theory method and classification of the EPN |
D. Vohl et.al. |
2311.09201v1 |
null |
2023-11-15 |
ExpM+NF: Differentially Private Machine Learning that Surpasses DPSGD |
Robert A. Bridges et.al. |
2311.09200v1 |
null |
2023-11-15 |
Domain Aligned CLIP for Few-shot Classification |
Muhammad Waleed Gondal et.al. |
2311.09191v1 |
null |
2023-11-15 |
ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models |
Jierui Li et.al. |
2311.09182v1 |
null |
2023-11-15 |
RBPGAN: Recurrent Back-Projection GAN for Video Super Resolution |
Dareen Hussein et.al. |
2311.09178v1 |
null |
2023-11-15 |
Model Agnostic Explainable Selective Regression via Uncertainty Estimation |
Andrea Pugnana et.al. |
2311.09145v1 |
null |
2023-11-15 |
Explainable Text Classification Techniques in Legal Document Review: Locating Rationales without Using Human Annotated Training Text Snippets |
Christian Mahoney et.al. |
2311.09133v1 |
null |
2023-11-15 |
Cross-view and Cross-pose Completion for 3D Human Understanding |
Matthieu Armando et.al. |
2311.09104v1 |
null |
2023-11-14 |
MVSA-Net: Multi-View State-Action Recognition for Robust and Deployable Trajectory Generation |
Ehsan Asali et.al. |
2311.08393v1 |
null |
2023-11-14 |
USLR: an open-source tool for unbiased and smooth longitudinal registration of brain MR |
Adrià Casamitjana et.al. |
2311.08371v1 |
link |
2023-11-14 |
Inverse Learning with Extremely Sparse Feedback for Recommendation |
Guanyu Lin et.al. |
2311.08302v1 |
null |
2023-11-14 |
Level Set KSVD |
Omer Sapir et.al. |
2311.08284v1 |
null |
2023-11-14 |
TENT: Connect Language Models with IoT Sensors for Zero-Shot Activity Recognition |
Yunjiao Zhou et.al. |
2311.08245v1 |
null |
2023-11-14 |
MCMC to address model misspecification in Deep Learning classification of Radio Galaxies |
Devina Mohan et.al. |
2311.08243v1 |
null |
2023-11-14 |
Learning Physics-Inspired Regularization for Medical Image Registration with Hypernetworks |
Anna Reithmeir et.al. |
2311.08239v1 |
link |
2023-11-14 |
Counterfactual Explanation for Regression via Disentanglement in Latent Space |
Xuan Zhao et.al. |
2311.08228v1 |
null |
2023-11-14 |
Uni-COAL: A Unified Framework for Cross-Modality Synthesis and Super-Resolution of MR Images |
Zhiyun Song et.al. |
2311.08225v1 |
null |
2023-11-14 |
Eval-GCSC: A New Metric for Evaluating ChatGPT's Performance in Chinese Spelling Correction |
Kunting Li et.al. |
2311.08219v1 |
link |
2023-11-13 |
GPT-4V(ision) as A Social Media Analysis Engine |
Hanjia Lyu et.al. |
2311.07547v1 |
link |
2023-11-13 |
mlscorecheck: Testing the consistency of reported performance scores and experiments in machine learning |
György Kovács et.al. |
2311.07541v1 |
null |
2023-11-13 |
FEMDA: a unified framework for discriminant analysis |
Pierre Houdouin et.al. |
2311.07518v1 |
null |
2023-11-13 |
Reducing the Need for Backpropagation and Discovering Better Optima With Explicit Optimizations of Neural Networks |
Jake Ryland Williams et.al. |
2311.07498v1 |
null |
2023-11-13 |
Towards Robotic Tree Manipulation: Leveraging Graph Representations |
Chung Hee Kim et.al. |
2311.07479v1 |
null |
2023-11-13 |
Temporal Performance Prediction for Deep Convolutional Long Short-Term Memory Networks |
Laura Fieback et.al. |
2311.07477v1 |
null |
2023-11-13 |
Masked Face Dataset Generation and Masked Face Recognition |
Rui Cai et.al. |
2311.07475v1 |
link |
2023-11-13 |
A Bayesian Approach to Strong Lens Finding in the Era of Wide-area Surveys |
Philip Holloway et.al. |
2311.07455v1 |
null |
2023-11-13 |
On the Robustness of Neural Collapse and the Neural Collapse of Robustness |
Jingtong Su et.al. |
2311.07444v1 |
null |
2023-11-13 |
Optimising Human-AI Collaboration by Learning Convincing Explanations |
Alex J. Chan et.al. |
2311.07426v1 |
null |
2023-11-10 |
Learning Human Action Recognition Representations Without Real Humans |
Howard Zhong et.al. |
2311.06231v1 |
link |
2023-11-10 |
Semantic-aware Video Representation for Few-shot Action Recognition |
Yutao Tang et.al. |
2311.06218v1 |
null |
2023-11-10 |
MultiIoT: Towards Large-scale Multisensory Learning for the Internet of Things |
Shentong Mo et.al. |
2311.06217v1 |
null |
2023-11-10 |
Deep learning segmentation of fibrous cap in intravascular optical coherence tomography images |
Juhwan Lee et.al. |
2311.06202v1 |
null |
2023-11-10 |
An Automated Pipeline for Tumour-Infiltrating Lymphocyte Scoring in Breast Cancer |
Adam J Shephard et.al. |
2311.06185v1 |
link |
2023-11-10 |
Automatic Report Generation for Histopathology images using pre-trained Vision Transformers |
Saurav Sengupta et.al. |
2311.06176v1 |
null |
2023-11-10 |
Two vertex geometrically irreducible algebras |
Grzegorz Bobinski et.al. |
2311.06173v1 |
null |
2023-11-10 |
Time Scale Network: A Shallow Neural Network For Time Series Data |
Trevor Meyer et.al. |
2311.06170v1 |
null |
2023-11-10 |
Deep Fast Vision: A Python Library for Accelerated Deep Transfer Learning Vision Prototyping |
Fabi Prezja et.al. |
2311.06169v1 |
link |
2023-11-10 |
Going beyond persistent homology using persistent homology |
Johanna Immonen et.al. |
2311.06152v1 |
null |
2023-11-09 |
FogROS2-Sky: Optimizing Latency and Cost for Multi-Cloud Robot Applications |
Kaiyuan Chen et.al. |
2311.05600v1 |
null |
2023-11-09 |
A Coefficient Makes SVRG Effective |
Yida Yin et.al. |
2311.05589v1 |
link |
2023-11-09 |
Outlier-Robust Wasserstein DRO |
Sloan Nietert et.al. |
2311.05573v1 |
link |
2023-11-09 |
Exploring Emotion Expression Recognition in Older Adults Interacting with a Virtual Coach |
Cristina Palmero et.al. |
2311.05567v1 |
null |
2023-11-09 |
Disentangling Quantum and Classical Contributions in Hybrid Quantum Machine Learning Architectures |
Michael Kölle et.al. |
2311.05559v1 |
null |
2023-11-09 |
L-WaveBlock: A Novel Feature Extractor Leveraging Wavelets for Generative Adversarial Networks |
Mirat Shah et.al. |
2311.05548v1 |
null |
2023-11-09 |
BakedAvatar: Baking Neural Fields for Real-Time Head Avatar Synthesis |
Hao-Bin Duan et.al. |
2311.05521v1 |
null |
2023-11-09 |
Dirichlet Active Learning |
Kevin Miller et.al. |
2311.05501v1 |
null |
2023-11-09 |
Retinal OCT Synthesis with Denoising Diffusion Probabilistic Models for Layer Segmentation |
Yuli Wu et.al. |
2311.05479v1 |
null |
2023-11-09 |
Robust Retraining-free GAN Fingerprinting via Personalized Normalization |
Jianwei Fei et.al. |
2311.05478v1 |
null |
2023-11-08 |
Towards Few-Annotation Learning in Computer Vision: Application to Image Classification and Object Detection tasks |
Quentin Bouniot et.al. |
2311.04888v1 |
null |
2023-11-08 |
Are foundation models efficient for medical image segmentation? |
Danielle Ferreira et.al. |
2311.04847v1 |
null |
2023-11-08 |
Bayesian multi-band fitting of alerts for kilonovae detection |
Biswajit Biswas et.al. |
2311.04845v1 |
null |
2023-11-08 |
Hierarchically Gated Recurrent Neural Network for Sequence Modeling |
Zhen Qin et.al. |
2311.04823v1 |
link |
2023-11-08 |
A Lightweight Architecture for Real-Time Neuronal-Spike Classification |
Muhammad Ali Siddiqi et.al. |
2311.04808v1 |
null |
2023-11-08 |
Determination of toxic comments and unintended model bias minimization using Deep learning approach |
Md Azim Khan et.al. |
2311.04789v1 |
null |
2023-11-08 |
VioLA: Aligning Videos to 2D LiDAR Scans |
Jun-Jee Chao et.al. |
2311.04783v1 |
null |
2023-11-08 |
FetMRQC: an open-source machine learning framework for multi-centric fetal brain MRI quality control |
Thomas Sanchez et.al. |
2311.04780v1 |
link |
2023-11-08 |
GCS-ICHNet: Assessment of Intracerebral Hemorrhage Prognosis using Self-Attention with Domain Knowledge Integration |
Xuhao Shan et.al. |
2311.04772v1 |
link |
2023-11-08 |
An attention-based deep learning network for predicting Platinum resistance in ovarian cancer |
Haoming Zhuang et.al. |
2311.04769v1 |
null |
2023-11-08 |
Video Instance Matting |
Jiachen Li et.al. |
2311.04212v2 |
link |
2023-11-07 |
JPAVE: A Generation and Classification-based Model for Joint Product Attribute Prediction and Value Extraction |
Zhongfen Deng et.al. |
2311.04196v1 |
link |
2023-11-07 |
Linear to circular conversion in the polarized radio emission of a magnetar |
Marcus E. Lower et.al. |
2311.04195v1 |
null |
2023-11-07 |
SpaDeLeF: A Dataset for Hierarchical Classification of Lexical Functions for Collocations in Spanish |
Yevhen Kostiuk et.al. |
2311.04189v1 |
null |
2023-11-07 |
A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis |
Dipanjyoti Paul et.al. |
2311.04157v1 |
link |
2023-11-07 |
Galaxy Spectra neural Network (GaSNet). II. Using Deep Learning for Spectral Classification and Redshift Predictions |
Fucheng Zhong et.al. |
2311.04146v1 |
null |
2023-11-07 |
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models |
Shiwei Zhang et.al. |
2311.04145v1 |
null |
2023-11-07 |
Modelling Sentiment Analysis: LLMs and data augmentation techniques |
Guillem Senabre Prades et.al. |
2311.04139v1 |
null |
2023-11-07 |
Improved Topological Preservation in 3D Axon Segmentation and Centerline Detection using Geometric Assessment-driven Topological Smoothing (GATS) |
Nina I. Shamsi et.al. |
2311.04116v1 |
null |
2023-11-07 |
Joint modelling of recurrent and terminal events with discretely-distributed non-parametric frailty: application on re-hospitalizations and death in heart failure patients |
Chiara Masci et.al. |
2311.04103v1 |
null |
2023-11-06 |
A Classification of Graphs through Quadratic Embedding Constants and Clique Graph Insights |
Edy Tri Baskoro et.al. |
2311.03342v1 |
null |
2023-11-06 |
Tackling Concept Shift in Text Classification using Entailment-style Modeling |
Sumegh Roychowdhury et.al. |
2311.03320v1 |
null |
2023-11-06 |
A Foundation Model for Music Informatics |
Minz Won et.al. |
2311.03318v1 |
link |
2023-11-06 |
FATE: Feature-Agnostic Transformer-based Encoder for learning generalized embedding spaces in flow cytometry data |
Lisa Weijler et.al. |
2311.03314v1 |
link |
2023-11-06 |
A Single 2D Pose with Context is Worth Hundreds for 3D Human Pose Estimation |
Qitao Zhao et.al. |
2311.03312v1 |
null |
2023-11-06 |
Advancing Post Hoc Case Based Explanation with Feature Highlighting |
Eoin Kenny et.al. |
2311.03246v1 |
null |
2023-11-06 |
Machine Learning-Based Tea Leaf Disease Detection: A Comprehensive Review |
Faruk Ahmed et.al. |
2311.03240v1 |
null |
2023-11-06 |
Out-of-distribution Detection Learning with Unreliable Out-of-distribution Sources |
Haotian Zheng et.al. |
2311.03236v1 |
null |
2023-11-06 |
Segmentation of Drone Collision Hazards in Airborne RADAR Point Clouds Using PointNet |
Hector Arroyo et.al. |
2311.03221v1 |
null |
2023-11-06 |
Leveraging Transformers to Improve Breast Cancer Classification and Risk Assessment with Multi-modal and Longitudinal Data |
Yiqiu Shen et.al. |
2311.03217v1 |
null |
2023-11-03 |
LOTUS: Continual Imitation Learning for Robot Manipulation Through Unsupervised Skill Discovery |
Weikang Wan et.al. |
2311.02058v1 |
null |
2023-11-03 |
MetaFast: Enabling Fast Metagenomic Classification via Seed Counting and Edit Distance Approximation |
Arvid E. Gollwitzer et.al. |
2311.02029v1 |
null |
2023-11-03 |
A Structured Pruning Algorithm for Model-based Deep Learning |
Chicago Park et.al. |
2311.02003v1 |
null |
2023-11-03 |
Detection of keratoconus Diseases using deep Learning |
AKM Enzam-Ul Haque et.al. |
2311.01996v1 |
null |
2023-11-03 |
Obtaining Explainable Classification Models using Distributionally Robust Optimization |
Sanjeeb Dash et.al. |
2311.01994v1 |
null |
2023-11-03 |
Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation |
Shichao Dong et.al. |
2311.01989v1 |
null |
2023-11-06 |
RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory Sketches |
Jiayuan Gu et.al. |
2311.01977v2 |
null |
2023-11-03 |
Welded graphs, Wirtinger groups and knotted punctured spheres |
Benjamin Audoux et.al. |
2311.01922v1 |
null |
2023-11-03 |
Contrast-Agnostic Groupwise Registration by Robust PCA for Quantitative Cardiac MRI |
Xinqi Li et.al. |
2311.01916v1 |
null |
2023-11-03 |
VQPy: An Object-Oriented Approach to Modern Video Analytics |
Shan Yu et.al. |
2311.01623v1 |
null |
2023-11-02 |
Tailoring Mixup to Data using Kernel Warping functions |
Quentin Bouniot et.al. |
2311.01434v1 |
link |
2023-11-02 |
Identifying Alzheimer Disease Dementia Levels Using Machine Learning Methods |
Md Gulzar Hussain et.al. |
2311.01428v1 |
null |
2023-11-02 |
Exploring Deep Learning Techniques for Glaucoma Detection: A Comprehensive Review |
Aized Amin Soofi et.al. |
2311.01425v1 |
null |
2023-11-02 |
Holistic Transfer: Towards Non-Disruptive Fine-Tuning with Partial Target Data |
Cheng-Hao Tu et.al. |
2311.01420v1 |
null |
2023-11-02 |
Learning to See Physical Properties with Active Sensing Motor Policies |
Gabriel B. Margolis et.al. |
2311.01405v1 |
null |
2023-11-02 |
Sim2Real Bilevel Adaptation for Object Surface Classification using Vision-Based Tactile Sensors |
Gabriele M. Caddeo et.al. |
2311.01380v1 |
link |
2023-11-02 |
Deep learning based Image Compression for Microscopy Images: An Empirical Study |
Yu Zhou et.al. |
2311.01352v1 |
null |
2023-11-02 |
Unreading Race: Purging Protected Features from Chest X-ray Embeddings |
Tobias Weber et.al. |
2311.01349v1 |
null |
2023-11-02 |
Scattering Vision Transformer: Spectral Mixing Matters |
Badri N. Patro et.al. |
2311.01310v1 |
null |
2023-11-02 |
Hybrid-Fusion Transformer for Multisequence MRI |
Jihoon Cho et.al. |
2311.01308v1 |
null |
2023-11-01 |
Software Repositories and Machine Learning Research in Cyber Security |
Mounika Vanamala et.al. |
2311.00691v1 |
null |
2023-11-01 |
What User Behaviors Make the Differences During the Process of Visual Analytics? |
Shahin Doroudian et.al. |
2311.00690v1 |
null |
2023-11-01 |
Deep Learning-Based Classification of Gamma Photon Interactions in Room-Temperature Semiconductor Radiation Detectors |
Sandeep K. Chaudhuri et.al. |
2311.00682v1 |
null |
2023-11-01 |
Latent Space Translation via Semantic Alignment |
Valentino Maiorca et.al. |
2311.00664v1 |
link |
2023-11-01 |
Rediscussion of eclipsing binaries. Paper XV. The B-type supergiant system V1765 Cygni |
John Southworth et.al. |
2311.00655v1 |
null |
2023-11-02 |
Emergence of Collective Open-Ended Exploration from Decentralized Meta-Reinforcement Learning |
Richard Bornemann et.al. |
2311.00651v2 |
null |
2023-11-01 |
Understanding the Issues and Causes in WebAssembly Application Development: A Mining-based Study |
Muhammad Waseem et.al. |
2311.00646v1 |
null |
2023-11-01 |
A Bi-level Framework for Traffic Accident Duration Prediction: Leveraging Weather and Road Condition Data within a Practical Optimum Pipeline |
Rafat Tabassum Sukonna et.al. |
2311.00634v1 |
null |
2023-11-01 |
Controllable Music Production with Diffusion Models and Guidance Gradients |
Mark Levy et.al. |
2311.00613v1 |
null |
2023-11-01 |
A Robust Deep Learning Method with Uncertainty Estimation for the Pathological Classification of Renal Cell Carcinoma based on CT Images |
Ni Yao et.al. |
2311.00567v1 |
null |
2023-10-31 |
Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders |
Srijan Das et.al. |
2310.20704v1 |
null |
2023-10-31 |
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction |
Xinyuan Chen et.al. |
2310.20700v1 |
null |
2023-10-31 |
StairNet: Visual Recognition of Stairs for Human-Robot Locomotion |
Andrew Garrett Kurbis et.al. |
2310.20666v1 |
null |
2023-10-31 |
Performance Improvement in Multi-class Classification via Automated Hierarchy Generation and Exploitation through Extended LCPN Schemes |
Celal Alagoz et.al. |
2310.20641v1 |
null |
2023-10-31 |
Deepfake detection by exploiting surface anomalies: the SurFake approach |
Andrea Ciamarra et.al. |
2310.20621v1 |
null |
2023-10-31 |
Enhanced Synthetic MRI Generation from CT Scans Using CycleGAN with Feature Extraction |
Saba Nikbakhsh et.al. |
2310.20604v1 |
null |
2023-10-31 |
Finiteness properties for Shimura curves and modified diagonal cycles |
Congling Qiu et.al. |
2310.20600v1 |
null |
2023-10-31 |
Brain-like Flexible Visual Inference by Harnessing Feedback-Feedforward Alignment |
Tahereh Toosi et.al. |
2310.20599v1 |
link |
2023-10-31 |
Tracially Complete C*-Algebras |
José R. Carrión et.al. |
2310.20594v1 |
null |
2023-10-31 |
Strongly Magnetized Tidal Disruption Event Disks via Stream Injection in GRMHD |
Brandon Curd et.al. |
2310.20592v1 |
null |
2023-10-29 |
Improved Motor Imagery Classification Using Adaptive Spatial Filters Based on Particle Swarm Optimization Algorithm |
Xiong Xiong et.al. |
2310.19202v1 |
null |
2023-10-29 |
Enhancing Motor Imagery Decoding in Brain Computer Interfaces using Riemann Tangent Space Mapping and Cross Frequency Coupling |
Xiong Xiong et.al. |
2310.19198v1 |
null |
2023-10-29 |
A Survey on Watching Social Issue Videos among YouTube and TikTok Users |
Shuo Niu et.al. |
2310.19193v1 |
null |
2023-10-29 |
Subjective Quality Evaluation of Point Clouds Using a Head Mounted Display |
Joao Prazeres et.al. |
2310.19179v1 |
null |
2023-10-29 |
Robustifying Language Models with Test-Time Adaptation |
Noah Thomas McDermott et.al. |
2310.19177v1 |
null |
2023-10-29 |
Predicting recovery following stroke: deep learning, multimodal data and feature selection using explainable AI |
Adam White et.al. |
2310.19174v1 |
null |
2023-10-29 |
BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping |
Srikumar Sastry et.al. |
2310.19168v1 |
link |
2023-10-29 |
Unified Representation for Non-compositional and Compositional Expressions |
Ziheng Zeng et.al. |
2310.19127v1 |
null |
2023-10-29 |
Efficient IoT Inference via Context-Awareness |
Mohammad Mehdi Rastikerdar et.al. |
2310.19112v1 |
null |
2023-10-29 |
Pushdown Layers: Encoding Recursive Structure in Transformer Language Models |
Shikhar Murty et.al. |
2310.19089v1 |
null |
2023-10-27 |
Addressing GAN Training Instabilities via Tunable Classification Losses |
Monica Welfert et.al. |
2310.18291v1 |
null |
2023-10-27 |
PlantPlotGAN: A Physics-Informed Generative Adversarial Network for Plant Disease Prediction |
Felipe A. Lopes et.al. |
2310.18268v1 |
null |
2023-10-27 |
MalFake: A Multimodal Fake News Identification for Malayalam using Recurrent Neural Networks and VGG-16 |
Adhish S. Sujan et.al. |
2310.18263v1 |
null |
2023-10-27 |
Edge AI-Based Vein Detector for Efficient Venipuncture in the Antecubital Fossa |
Edwin Salcedo et.al. |
2310.18234v1 |
null |
2023-10-27 |
TBDLNet: a network for classifying multidrug-resistant and drug-sensitive tuberculosis |
Ziquan Zhu et.al. |
2310.18222v1 |
null |
2023-10-27 |
ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models |
Benjamin Feuer et.al. |
2310.18208v1 |
link |
2023-10-27 |
Artifact-Robust Graph-Based Learning in Digital Pathology |
Saba Heidari Gheshlaghi et.al. |
2310.18192v1 |
null |
2023-10-27 |
Globular clusters and bar: captured or not captured? |
Anton A. Smirnov et.al. |
2310.18172v1 |
null |
2023-10-27 |
Style Description based Text-to-Speech with Conditional Prosodic Layer Normalization based Diffusion GAN |
Neeraj Kumar et.al. |
2310.18169v1 |
null |
2023-10-27 |
DESiRED -- Dynamic, Enhanced, and Smart iRED: A P4-AQM with Deep Reinforcement Learning and In-band Network Telemetry |
Leandro C. de Almeida et.al. |
2310.18159v1 |
null |
2023-10-26 |
A Coarse-to-Fine Pseudo-Labeling (C2FPL) Framework for Unsupervised Video Anomaly Detection |
Anas Al-lahham et.al. |
2310.17650v1 |
null |
2023-10-26 |
torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP |
Yoshitomo Matsubara et.al. |
2310.17644v1 |
link |
2023-10-26 |
Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models |
Tsun-Hsuan Wang et.al. |
2310.17642v1 |
null |
2023-10-26 |
Skew Products on the Berkovich Projective Line |
Richard A. P. Birkett et.al. |
2310.17628v1 |
null |
2023-10-26 |
A Survey on Transferability of Adversarial Examples across Deep Neural Networks |
Jindong Gu et.al. |
2310.17626v1 |
link |
2023-10-26 |
MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations |
Ajay Mandlekar et.al. |
2310.17596v1 |
null |
2023-10-26 |
Linear $x$-coordinate relations of triples on elliptic curves |
Jerson Caro et.al. |
2310.17592v1 |
null |
2023-10-26 |
A minimax optimal control approach for robust neural ODEs |
Cristina Cipriani et.al. |
2310.17584v1 |
null |
2023-10-26 |
BLIS-Net: Classifying and Analyzing Signals on Graphs |
Charles Xu et.al. |
2310.17579v1 |
null |
2023-10-26 |
Knots bounding non-isotopic ribbon disks |
Jeffrey Meier et.al. |
2310.17564v1 |
null |
2023-10-25 |
RDBench: ML Benchmark for Relational Databases |
Zizhao Zhang et.al. |
2310.16837v1 |
link |
2023-10-25 |
TD-MPC2: Scalable, Robust World Models for Continuous Control |
Nicklas Hansen et.al. |
2310.16828v1 |
null |
2023-10-26 |
Deep machine learning for meteor monitoring: advances with transfer learning and gradient-weighted class activation mapping |
Eloy Peña-Asensio et.al. |
2310.16826v2 |
null |
2023-10-25 |
Uncovering a new group of T Tauri stars in the Taurus-Auriga molecular complex from Gaia and GALEX data |
Ana Inés Gómez de Castro et.al. |
2310.16820v1 |
null |
2023-10-25 |
Using Diffusion Models to Generate Synthetic Labelled Data for Medical Image Segmentation |
Daniel Saragih et.al. |
2310.16794v1 |
null |
2023-10-25 |
Navigating Socio-Emotional Risk through Comfort-Building in a Physics Teaching Community of Practice: A Case Study |
Maggie Mahmood et.al. |
2310.16778v1 |
null |
2023-10-25 |
IntenDD: A Unified Contrastive Learning Approach for Intent Detection and Discovery |
Bhavuk Singhal et.al. |
2310.16761v1 |
null |
2023-10-25 |
Interferometric Neural Networks |
Arun Sehrawat et.al. |
2310.16742v1 |
link |
2023-10-25 |
A No-Reference Quality Assessment Method for Digital Human Head |
Yingjie Zhou et.al. |
2310.16732v1 |
null |
2023-10-25 |
Spherical Wavefront Near-Field DoA Estimation in THz Automotive Radar |
Ahmet M. Elbir et.al. |
2310.16724v1 |
null |
2023-10-24 |
From Posterior Sampling to Meaningful Diversity in Image Restoration |
Noa Cohen et.al. |
2310.16047v1 |
null |
2023-10-24 |
Finetuning Offline World Models in the Real World |
Yunhai Feng et.al. |
2310.16029v1 |
null |
2023-10-24 |
Human-in-the-Loop Task and Motion Planning for Imitation Learning |
Ajay Mandlekar et.al. |
2310.16014v1 |
null |
2023-10-24 |
CVPR 2023 Text Guided Video Editing Competition |
Jay Zhangjie Wu et.al. |
2310.16003v1 |
null |
2023-10-24 |
Vision-Language Pseudo-Labels for Single-Positive Multi-Label Learning |
Xin Xing et.al. |
2310.15985v1 |
link |
2023-10-24 |
Geometry-Aware Video Quality Assessment for Dynamic Digital Human |
Zicheng Zhang et.al. |
2310.15984v1 |
null |
2023-10-24 |
Minimax Forward and Backward Learning of Evolving Tasks with Performance Guarantees |
Verónica Álvarez et.al. |
2310.15974v1 |
link |
2023-10-24 |
Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection |
Manyuan Zhang et.al. |
2310.15955v1 |
null |
2023-10-25 |
Improving Robustness and Reliability in Medical Image Classification with Latent-Guided Diffusion and Nested-Ensembles |
Xing Shen et.al. |
2310.15952v2 |
null |
2023-10-24 |
ShARc: Shape and Appearance Recognition for Person Identification In-the-wild |
Haidong Zhu et.al. |
2310.15946v1 |
null |
2023-10-23 |
FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling |
Haonan Qiu et.al. |
2310.15169v1 |
null |
2023-10-23 |
Bitrate Ladder Prediction Methods for Adaptive Video Streaming: A Review and Benchmark |
Ahmed Telili et.al. |
2310.15163v1 |
null |
2023-10-23 |
Linear Representations of Sentiment in Large Language Models |
Curt Tigges et.al. |
2310.15154v1 |
null |
2023-10-23 |
Unlocking the Transferability of Tokens in Deep Models for Tabular Data |
Qi-Le Zhou et.al. |
2310.15149v1 |
null |
2023-10-23 |
When Should the FDA Inspect Pharmaceutical Manufacturing Facilities to Better Mitigate Drug Shortages? |
Daniel Kosmas et.al. |
2310.15146v1 |
null |
2023-10-23 |
Novel-View Acoustic Synthesis from 3D Reconstructed Rooms |
Byeongjoo Ahn et.al. |
2310.15130v1 |
link |
2023-10-23 |
Open-Ended Instructable Embodied Agents with Memory-Augmented Large Language Models |
Gabriel Sarch et.al. |
2310.15127v1 |
null |
2023-10-23 |
SpVOS: Efficient Video Object Segmentation with Triple Sparse Convolution |
Weihao Lin et.al. |
2310.15115v1 |
null |
2023-10-23 |
The Self 2.0: How AI-Enhanced Self-Clones Transform Self-Perception and Improve Presentation Skills |
Qingxiao Zheng et.al. |
2310.15112v1 |
null |
2023-10-23 |
Matryoshka Diffusion Models |
Jiatao Gu et.al. |
2310.15111v1 |
null |
2023-10-20 |
Using Human-like Mechanism to Weaken Effect of Pre-training Weight Bias in Face-Recognition Convolutional Neural Network |
Haojiang Ying et.al. |
2310.13674v1 |
null |
2023-10-23 |
Explainable Depression Symptom Detection in Social Media |
Eliseo Bao Souto et.al. |
2310.13664v2 |
null |
2023-10-20 |
Arabic Dialect Identification under Scrutiny: Limitations of Single-label Classification |
Amr Keleg et.al. |
2310.13661v1 |
link |
2023-10-20 |
Optimal Transport for Measures with Noisy Tree Metric |
Tam Le et.al. |
2310.13653v1 |
null |
2023-10-20 |
Principal $2$-blocks with wreathed defect groups up to splendid Morita equivalence |
Shigeo Koshitani et.al. |
2310.13621v1 |
null |
2023-10-20 |
Skin Lesion Segmentation Improved by Transformer-based Networks with Inter-scale Dependency Modeling |
Sania Eskandari et.al. |
2310.13604v1 |
link |
2023-10-20 |
Classification of quantum states of light using random measurements through a multimode fiber |
Saroch Leedumrongwatthanakun et.al. |
2310.13599v1 |
null |
2023-10-20 |
Longer-range Contextualized Masked Autoencoder |
Taekyung Kim et.al. |
2310.13593v1 |
null |
2023-10-20 |
POTLoc: Pseudo-Label Oriented Transformer for Point-Supervised Temporal Action Localization |
Elahe Vahdani et.al. |
2310.13585v1 |
null |
2023-10-20 |
Progressive Dual Priori Network for Generalized Breast Tumor Segmentation |
Li Wang et.al. |
2310.13574v1 |
null |
2023-10-19 |
Putting the Object Back into Video Object Segmentation |
Ho Kei Cheng et.al. |
2310.12982v1 |
link |
2023-10-19 |
Variational Inference for SDEs Driven by Fractional Noise |
Rembert Daems et.al. |
2310.12975v1 |
null |
2023-10-19 |
Frozen Transformers in Language Models Are Effective Visual Encoder Layers |
Ziqi Pang et.al. |
2310.12973v1 |
link |
2023-10-19 |
Bialgebra structures on flat Lie algebras |
Amine Bahayou et.al. |
2310.12966v1 |
null |
2023-10-19 |
End-to-End Delay Minimization based on Joint Optimization of DNN Partitioning and Resource Allocation for Cooperative Edge Inference |
Xinrui Ye et.al. |
2310.12937v1 |
null |
2023-10-19 |
Digital Twin-Enabled Intelligent DDoS Detection Mechanism for Autonomous Core Networks |
Yagmur Yigit et.al. |
2310.12924v1 |
null |
2023-10-19 |
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning |
Juan Rocamonde et.al. |
2310.12921v1 |
null |
2023-10-19 |
Unsupervised Object Localization in the Era of Self-Supervised ViTs: A Survey |
Oriane Siméoni et.al. |
2310.12904v1 |
link |
2023-10-19 |
A Markovian dynamics for $C. elegans$ behavior across scales |
Antonio C. Costa et.al. |
2310.12883v1 |
link |
2023-10-19 |
Perceptual Assessment and Optimization of High Dynamic Range Image Rendering |
Peibei Cao et.al. |
2310.12877v1 |
null |
2023-10-18 |
SHARCS: Efficient Transformers through Routing with Dynamic Width Sub-networks |
Mohammadreza Salehi et.al. |
2310.12126v1 |
null |
2023-10-18 |
Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture |
Daniel Y. Fu et.al. |
2310.12109v1 |
null |
2023-10-18 |
HSTR-Net: Reference Based Video Super-resolution for Aerial Surveillance with Dual Cameras |
H. Umut Suluhan et.al. |
2310.12092v1 |
null |
2023-10-18 |
Chemical Analysis of the Brightest Star of the Cetus II Ultra-Faint Dwarf Galaxy Candidate |
K. B. Webber et.al. |
2310.12090v1 |
null |
2023-10-18 |
One-Shot Imitation Learning: A Pose Estimation Perspective |
Pietro Vitiello et.al. |
2310.12077v1 |
null |
2023-10-18 |
Exploring Fairness in Pre-trained Visual Transformer based Natural and GAN Generated Image Detection Systems and Understanding the Impact of Image Compression in Fairness |
Manjary P. Gangan et.al. |
2310.12076v1 |
null |
2023-10-18 |
Black-Box Training Data Identification in GANs via Detector Networks |
Lukman Olagoke et.al. |
2310.12063v1 |
null |
2023-10-19 |
Robust Class-Conditional Distribution Alignment for Partial Domain Adaptation |
Sandipan Choudhuri et.al. |
2310.12060v2 |
null |
2023-10-18 |
Exact and efficient solutions of the LMC Multitask Gaussian Process model |
Olivier Truffinet et.al. |
2310.12032v1 |
link |
2023-10-18 |
CORE: A Few-Shot Company Relation Classification Dataset for Robust Domain Adaptation |
Philipp Borchert et.al. |
2310.12024v1 |
link |
2023-10-17 |
DELIFFAS: Deformable Light Fields for Fast Avatar Synthesis |
Youngjoong Kwon et.al. |
2310.11449v1 |
null |
2023-10-18 |
4K4D: Real-Time 4D View Synthesis at 4K Resolution |
Zhen Xu et.al. |
2310.11448v2 |
null |
2023-10-18 |
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models |
Yaofang Liu et.al. |
2310.11440v2 |
null |
2023-10-17 |
Transitive generalized toggle groups containing a cycle |
Jonathan S. Bloom et.al. |
2310.11387v1 |
null |
2023-10-17 |
DialogueLLM: Context and Emotion Knowledge-Tuned LLaMA Models for Emotion Recognition in Conversations |
Yazhou Zhang et.al. |
2310.11374v1 |
null |
2023-10-17 |
VECHR: A Dataset for Explainable and Robust Classification of Vulnerability Type in the European Court of Human Rights |
Shanshan Xu et.al. |
2310.11368v1 |
null |
2023-10-17 |
Lie Group Decompositions for Equivariant Neural Networks |
Mircea Mironenco et.al. |
2310.11366v1 |
null |
2023-10-17 |
Hybrid quantum-classical graph neural networks for tumor classification in digital pathology |
Anupama Ray et.al. |
2310.11353v1 |
null |
2023-10-17 |
The effect of stemming and lemmatization on Portuguese fake news text classification |
Lucca de Freitas Santos et.al. |
2310.11344v1 |
null |
2023-10-17 |
Influencing factors on false positive rates when classifying tumor cell line response to drug treatment |
Priyanka Vasanthakumari et.al. |
2310.11329v1 |
null |
2023-10-16 |
A Survey on Video Diffusion Models |
Zhen Xing et.al. |
2310.10647v1 |
link |
2023-10-16 |
Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting |
Zeyu Yang et.al. |
2310.10642v1 |
link |
2023-10-16 |
Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models |
Kevin Black et.al. |
2310.10639v1 |
null |
2023-10-16 |
Efficacy of Dual-Encoders for Extreme Multi-Label Classification |
Nilesh Gupta et.al. |
2310.10636v1 |
null |
2023-10-16 |
Overcoming the Rayleigh limit in extremely low SNR |
Hyunsoo Choi et.al. |
2310.10633v1 |
null |
2023-10-16 |
Video Language Planning |
Yilun Du et.al. |
2310.10625v1 |
null |
2023-10-16 |
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing |
Jia-Wei Liu et.al. |
2310.10624v1 |
null |
2023-10-16 |
BiLL-VTG: Bridging Large Language Models and Lightweight Visual Tools for Video-based Texts Generation |
Ji Qi et.al. |
2310.10586v1 |
null |
2023-10-16 |
RefConv: Re-parameterized Refocusing Convolution for Powerful ConvNets |
Zhicheng Cai et.al. |
2310.10563v1 |
link |
2023-10-16 |
Deep learning applied to EEG data with different montages using spatial attention |
Dung Truong et.al. |
2310.10550v1 |
null |
2023-10-13 |
An Unbiased Look at Datasets for Visuo-Motor Pre-Training |
Sudeep Dasari et.al. |
2310.09289v1 |
null |
2023-10-13 |
Disentangled Latent Spaces Facilitate Data-Driven Auxiliary Learning |
Geri Skenderi et.al. |
2310.09278v1 |
null |
2023-10-13 |
A Hybrid Approach for Depression Classification: Random Forest-ANN Ensemble on Motor Activity Signals |
Anket Patil et.al. |
2310.09277v1 |
null |
2023-10-13 |
PromptRE: Weakly-Supervised Document-Level Relation Extraction via Prompting-Based Data Programming |
Chufan Gao et.al. |
2310.09265v1 |
null |
2023-10-13 |
Political claim identification and categorization in a multilingual setting: First experiments |
Urs Zaberer et.al. |
2310.09256v1 |
null |
2023-10-13 |
It's an Alignment, Not a Trade-off: Revisiting Bias and Variance in Deep Models |
Lin Chen et.al. |
2310.09250v1 |
null |
2023-10-13 |
A Multifaceted Look at Starlink Performance |
Nitinder Mohan et.al. |
2310.09242v1 |
null |
2023-10-13 |
Time CNN and Graph Convolution Network for Epileptic Spike Detection in MEG Data |
Pauline Mouches et.al. |
2310.09236v1 |
null |
2023-10-13 |
Ultrasound Image Segmentation of Thyroid Nodule via Latent Semantic Feature Co-Registration |
Xuewei Li et.al. |
2310.09221v1 |
null |
2023-10-13 |
PaLI-3 Vision Language Models: Smaller, Faster, Stronger |
Xi Chen et.al. |
2310.09199v1 |
null |
2023-10-12 |
Octopus: Embodied Vision-Language Programmer from Environmental Feedback |
Jingkang Yang et.al. |
2310.08588v1 |
link |
2023-10-12 |
Is Generalized Dynamic Novel View Synthesis from Monocular Videos Possible Today? |
Xiaoming Zhao et.al. |
2310.08587v1 |
null |
2023-10-12 |
Im4D: High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes |
Haotong Lin et.al. |
2310.08585v1 |
null |
2023-10-12 |
Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video |
Shashanka Venkataramanan et.al. |
2310.08584v1 |
null |
2023-10-12 |
Universal Visual Decomposer: Long-Horizon Manipulation Made Easy |
Zichen Zhang et.al. |
2310.08581v1 |
null |
2023-10-12 |
Learning to Act from Actionless Videos through Dense Correspondences |
Po-Chen Ko et.al. |
2310.08576v1 |
null |
2023-10-12 |
Effective isometries of periodic shells |
Hussein Nassar et.al. |
2310.08531v1 |
null |
2023-10-12 |
LLM-augmented Preference Learning from Natural Language |
Inwon Kang et.al. |
2310.08523v1 |
null |
2023-10-12 |
Impact of time and note duration tokenizations on deep learning symbolic music modeling |
Nathan Fradet et.al. |
2310.08497v1 |
link |
2023-10-12 |
GraphextQA: A Benchmark for Evaluating Graph-Enhanced Large Language Models |
Yuanchun Shen et.al. |
2310.08487v1 |
link |
2023-10-11 |
ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models |
Yingqing He et.al. |
2310.07702v1 |
link |
2023-10-11 |
ConditionVideo: Training-Free Condition-Guided Text-to-Video Generation |
Bo Peng et.al. |
2310.07697v1 |
null |
2023-10-11 |
Large-scale photonic computing with nonlinear disordered media |
Hao Wang et.al. |
2310.07690v1 |
null |
2023-10-11 |
Deep Video Inpainting Guided by Audio-Visual Self-Supervision |
Kyuyeon Kim et.al. |
2310.07663v1 |
null |
2023-10-11 |
Hypercomplex Multimodal Emotion Recognition from EEG and Peripheral Physiological Signals |
Eleonora Lopez et.al. |
2310.07648v1 |
null |
2023-10-11 |
Attention-Map Augmentation for Hypercomplex Breast Cancer Classification |
Eleonora Lopez et.al. |
2310.07633v1 |
null |
2023-10-11 |
Differentiable Euler Characteristic Transforms for Shape Classification |
Ernst Roell et.al. |
2310.07630v1 |
link |
2023-10-11 |
Time-Resolved Reconstruction of Motion, Force, and Stiffness using Spectro-Dynamic MRI |
Max H. C. van Riel et.al. |
2310.07622v1 |
null |
2023-10-11 |
Reinforcement Learning-based Knowledge Graph Reasoning for Explainable Fact-checking |
Gustav Nikopensius et.al. |
2310.07613v1 |
null |
2023-10-11 |
QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking |
Liangming Pan et.al. |
2310.07609v1 |
link |
2023-10-10 |
Convivial Solipsism as a maximally perspectival interpretation |
Herve Zwirn et.al. |
2310.06815v1 |
null |
2023-10-10 |
A Supervised Embedding and Clustering Anomaly Detection method for classification of Mobile Network Faults |
R. Mosayebi et.al. |
2310.06779v1 |
null |
2023-10-10 |
Optical assembly of nanostructures mediated by surface roughness |
Robert G. Felsted et.al. |
2310.06774v1 |
null |
2023-10-10 |
Uni3D: Exploring Unified 3D Representation at Scale |
Junsheng Zhou et.al. |
2310.06773v1 |
link |
2023-10-10 |
Improved convergence rates for some kernel random forest algorithms |
Isidoros Iakovidis et.al. |
2310.06760v1 |
null |
2023-10-10 |
Geographic Location Encoding with Spherical Harmonics and Sinusoidal Representation Networks |
Marc Rußwurm et.al. |
2310.06743v1 |
link |
2023-10-10 |
Multi-domain improves out-of-distribution and data-limited scenarios for medical image analysis |
Ece Ozkan et.al. |
2310.06737v1 |
null |
2023-10-10 |
S4Sleep: Elucidating the design space of deep-learning-based sleep stage classification models |
Tiezhi Wang et.al. |
2310.06715v1 |
link |
2023-10-10 |
Tertiary Lymphoid Structures Generation through Graph-based Diffusion |
Manuel Madeira et.al. |
2310.06661v1 |
null |
2023-10-10 |
Assessing the Impact of a Supervised Classification Filter on Flow-based Hybrid Network Anomaly Detection |
Dominik Macko et.al. |
2310.06656v1 |
link |
2023-10-09 |
FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing |
Yuren Cong et.al. |
2310.05922v1 |
null |
2023-10-09 |
Enumerating Calabi-Yau Manifolds: Placing bounds on the number of diffeomorphism classes in the Kreuzer-Skarke list |
Aditi Chandra et.al. |
2310.05909v1 |
null |
2023-10-09 |
ViCor: Bridging Visual Understanding and Commonsense Reasoning with Large Language Models |
Kaiwen Zhou et.al. |
2310.05872v1 |
null |
2023-10-10 |
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models |
Guangzhi Sun et.al. |
2310.05863v2 |
link |
2023-10-09 |
Latent Wander: an Alternative Interface for Interactive and Serendipitous Discovery of Large AV Archives |
Yuchen Yang et.al. |
2310.05835v1 |
null |
2023-10-09 |
Write What You Want: Applying Text-to-video Retrieval to Audiovisual Archives |
Yuchen Yang et.al. |
2310.05825v1 |
null |
2023-10-09 |
Dipole-Spread Function Engineering for 6D Super-Resolution Microscopy |
Tingting Wu et.al. |
2310.05810v1 |
null |
2023-10-09 |
A Simple Open-Loop Baseline for Reinforcement Learning Locomotion Tasks |
Antonin Raffin et.al. |
2310.05808v1 |
null |
2023-10-09 |
Learning Language-guided Adaptive Hyper-modality Representation for Multimodal Sentiment Analysis |
Haoyu Zhang et.al. |
2310.05804v1 |
null |
2023-10-10 |
Two-timescale Derivative Free Optimization for Performative Prediction with Markovian Data |
Haitong Liu et.al. |
2310.05792v2 |
null |
2023-10-06 |
Exploiting Transformer Activation Sparsity with Dynamic Inference |
Mikołaj Piórczyński et.al. |
2310.04361v1 |
null |
2023-10-06 |
SwimXYZ: A large-scale dataset of synthetic swimming motions and videos |
Fiche Guénolé et.al. |
2310.04360v1 |
null |
2023-10-06 |
Large-Scale Korean Text Dataset for Classifying Biased Speech in Real-World Online Services |
Dasol Choi et.al. |
2310.04313v1 |
null |
2023-10-06 |
Convergent ADMM Plug and Play PET Image Reconstruction |
Florent Sureau et.al. |
2310.04299v1 |
null |
2023-10-06 |
A Plug-and-Play Image Registration Network |
Junhao Hu et.al. |
2310.04297v1 |
null |
2023-10-06 |
Towards Non-contact 3D Ultrasound for Wrist Imaging |
Antony Jerald et.al. |
2310.04296v1 |
null |
2023-10-06 |
Spectroscopic variability of massive pre-main-sequence stars in M17 |
A. R. Derkink et.al. |
2310.04287v1 |
null |
2023-10-06 |
Multi-Industry Simplex : A Probabilistic Extension of GICS |
Maksim Papenkov et.al. |
2310.04280v1 |
null |
2023-10-06 |
Bringing Quantum Algorithms to Automated Machine Learning: A Systematic Review of AutoML Frameworks Regarding Extensibility for QML Algorithms |
Dennis Klau et.al. |
2310.04238v1 |
null |
2023-10-06 |
Written and spoken corpus of real and fake social media postings about COVID-19 |
Ng Bee Chin et.al. |
2310.04237v1 |
null |
2023-10-05 |
The Un-Kidnappable Robot: Acoustic Localization of Sneaking People |
Mengyu Yang et.al. |
2310.03743v1 |
null |
2023-10-05 |
Agent Instructs Large Language Models to be General Zero-Shot Reasoners |
Nicholas Crispino et.al. |
2310.03710v1 |
link |
2023-10-05 |
OMG-ATTACK: Self-Supervised On-Manifold Generation of Transferable Evasion Attacks |
Ofir Bar Tal et.al. |
2310.03707v1 |
null |
2023-10-05 |
Role of Spatial Coherence in Diffractive Optical Neural Networks |
Matthew J. Filipovich et.al. |
2310.03679v1 |
null |
2023-10-05 |
Certification of Deep Learning Models for Medical Image Segmentation |
Othmane Laousy et.al. |
2310.03664v1 |
null |
2023-10-05 |
Autoregressive Coefficients based Intelligent Protection of Transmission Lines Connected to Type-3 Wind Farms |
Pallav Kumar Bera et.al. |
2310.03663v1 |
null |
2023-10-05 |
Robustness-Guided Image Synthesis for Data-Free Quantization |
Jianhong Bai et.al. |
2310.03661v1 |
null |
2023-10-05 |
Balancing Autonomy and Alignment: A Multi-Dimensional Taxonomy for Autonomous LLM-powered Multi-Agent Architectures |
Thorsten Händler et.al. |
2310.03659v1 |
null |
2023-10-05 |
Strategic Evaluation: Subjects, Evaluators, and Society |
Benjamin Laufer et.al. |
2310.03655v1 |
null |
2023-10-05 |
CLEVRER-Humans: Describing Physical and Causal Events the Human Way |
Jiayuan Mao et.al. |
2310.03635v1 |
null |
2023-10-04 |
SemiReward: A General Reward Model for Semi-supervised Learning |
Siyuan Li et.al. |
2310.03013v1 |
link |
2023-10-04 |
High-dimensional SGD aligns with emerging outlier eigenspaces |
Gerard Ben Arous et.al. |
2310.03010v1 |
null |
2023-10-05 |
IBCL: Zero-shot Model Generation for Task Trade-offs in Continual Learning |
Pengyuan Lu et.al. |
2310.02995v2 |
link |
2023-10-04 |
Multiple Physics Pretraining for Physical Surrogate Models |
Michael McCabe et.al. |
2310.02994v1 |
null |
2023-10-04 |
UniverSLU: Universal Spoken Language Understanding for Diverse Classification and Sequence Generation Tasks with a Single Network |
Siddhant Arora et.al. |
2310.02973v1 |
null |
2023-10-04 |
Fully Automatic Segmentation of Gross Target Volume and Organs-at-Risk for Radiotherapy Planning of Nasopharyngeal Carcinoma |
Mehdi Astaraki et.al. |
2310.02972v1 |
null |
2023-10-04 |
Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model |
Kai-Wei Chang et.al. |
2310.02971v1 |
null |
2023-10-05 |
Co-modeling the Sequential and Graphical Routes for Peptide Representation Learning |
Zihan Liu et.al. |
2310.02964v2 |
link |
2023-10-04 |
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection |
Yang Cao et.al. |
2310.02960v1 |
link |
2023-10-04 |
HappyFeat -- An interactive and efficient BCI framework for clinical applications |
Arthur Desbois et.al. |
2310.02948v1 |
null |
2023-10-03 |
DREAM: Visual Decoding from Reversing Human Visual System |
Weihao Xia et.al. |
2310.02265v1 |
null |
2023-10-03 |
RSRD: A Road Surface Reconstruction Dataset and Benchmark for Safe and Comfortable Autonomous Driving |
Tong Zhao et.al. |
2310.02262v1 |
null |
2023-10-03 |
Harnessing Pre-Trained Sentence Transformers for Offensive Language Detection in Indian Languages |
Ananya Joshi et.al. |
2310.02249v1 |
null |
2023-10-04 |
Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks |
Greg Yang et.al. |
2310.02244v2 |
null |
2023-10-03 |
MIS-AVioDD: Modality Invariant and Specific Representation for Audio-Visual Deepfake Detection |
Vinaya Sree Katamneni et.al. |
2310.02234v1 |
null |
2023-10-03 |
HoloNets: Spectral Convolutions do extend to Directed Graphs |
Christian Koke et.al. |
2310.02232v1 |
null |
2023-10-03 |
Extraction of Medication and Temporal Relation from Clinical Text by Harnessing Different Deep Learning Models |
Hangyu Tu et.al. |
2310.02229v1 |
null |
2023-10-03 |
Symmetry-based classification of exact flat bands in single and bilayer moiré systems |
Siddhartha Sarkar et.al. |
2310.02218v1 |
null |
2023-10-03 |
Learnable Data Augmentation for One-Shot Unsupervised Domain Adaptation |
Julio Ivan Davila Carrazco et.al. |
2310.02201v1 |
null |
2023-10-03 |
CNN photometric redshifts in the SDSS at $r\leq 20$ |
M. Treyer et.al. |
2310.02173v1 |
null |
2023-09-29 |
A Large Language Model Approach to Educational Survey Feedback Analysis |
Michael J. Parker et.al. |
2309.17447v1 |
null |
2023-10-02 |
LLM-grounded Video Diffusion Models |
Long Lian et.al. |
2309.17444v2 |
null |
2023-09-29 |
Classification of Potholes Based on Surface Area Using Pre-Trained Models of Convolutional Neural Network |
Chauhdary Fazeel Ahmad et.al. |
2309.17426v1 |
null |
2023-09-29 |
CNN-based automatic segmentation of Lumen & Media boundaries in IVUS images using closed polygonal chains |
Pavel Sinha et.al. |
2309.17406v1 |
null |
2023-09-29 |
AV-CPL: Continuous Pseudo-Labeling for Audio-Visual Speech Recognition |
Andrew Rouditchenko et.al. |
2309.17395v1 |
null |
2023-09-29 |
Tree Cross Attention |
Leo Feng et.al. |
2309.17388v1 |
null |
2023-09-29 |
Adversarial Imitation Learning from Visual Observations using Latent Information |
Vittorio Giammarino et.al. |
2309.17371v1 |
link |
2023-09-29 |
SpinView: General interactive visual analysis tool for multiscale computational magnetism |
Qichen Xu et.al. |
2309.17367v1 |
null |
2023-09-29 |
Asynchronous Graph Generators |
Christopher P. Ley et.al. |
2309.17335v1 |
null |
2023-09-29 |
Multi-Depth Branches Network for Efficient Image Super-Resolution |
Huiyuan Tian et.al. |
2309.17334v1 |
link |
2023-09-29 |
Demystifying CLIP Data |
Hu Xu et.al. |
2309.16671v2 |
link |
2023-09-28 |
Decaf: Monocular Deformation Capture for Face and Hand Interactions |
Soshi Shimada et.al. |
2309.16670v1 |
null |
2023-09-28 |
Training a Large Video Model on a Single Machine in a Day |
Yue Zhao et.al. |
2309.16669v1 |
link |
2023-09-28 |
Novel Deep Learning Pipeline for Automatic Weapon Detection |
Haribharathi Sivakumar et.al. |
2309.16654v1 |
null |
2023-09-28 |
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning |
Qiao Gu et.al. |
2309.16650v1 |
null |
2023-09-29 |
Mixup Your Own Pairs |
Yilei Wu et.al. |
2309.16633v2 |
link |
2023-09-28 |
Class Activation Map-based Weakly supervised Hemorrhage Segmentation using Resnet-LSTM in Non-Contrast Computed Tomography images |
Shreyas H Ramananda et.al. |
2309.16627v1 |
null |
2023-09-28 |
The twisting index in semitoric systems |
Jaume Alonso et.al. |
2309.16614v1 |
null |
2023-09-28 |
Exploiting Edge Features in Graphs with Fused Network Gromov-Wasserstein Distance |
Junjie Yang et.al. |
2309.16604v1 |
null |
2023-09-28 |
Can LLMs Effectively Leverage Structural Information for Graph Learning: When and Why |
Jin Huang et.al. |
2309.16595v1 |
null |
2023-09-27 |
SHACIRA: Scalable HAsh-grid Compression for Implicit Neural Representations |
Sharath Girish et.al. |
2309.15848v1 |
null |
2023-09-27 |
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing |
Brian Yan et.al. |
2309.15826v1 |
null |
2023-09-27 |
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation |
David Junhao Zhang et.al. |
2309.15818v1 |
link |
2023-09-27 |
Convolutional Networks with Oriented 1D Kernels |
Alexandre Kirchmeyer et.al. |
2309.15812v1 |
link |
2023-09-27 |
A Quantum-Classical Hybrid Block-Matching Algorithm in Noisy Environment using Dissimilarity Measure |
M. Martínez-Felipe et.al. |
2309.15792v1 |
null |
2023-09-27 |
Large Language Model Routing with Benchmark Datasets |
Tal Shnitzer et.al. |
2309.15789v1 |
null |
2023-09-27 |
One For All: Video Conversation is Feasible Without Video Instruction Tuning |
Ruyang Liu et.al. |
2309.15785v1 |
null |
2023-09-27 |
Rapid Network Adaptation: Learning to Adapt Neural Networks Using Test-Time Feedback |
Teresa Yeo et.al. |
2309.15762v1 |
null |
2023-09-27 |
Automated CT Lung Cancer Screening Workflow using 3D Camera |
Brian Teixeira et.al. |
2309.15750v1 |
null |
2023-09-27 |
Data-Driven Latent Space Representation for Robust Bipedal Locomotion Learning |
Guillermo A. Castillo et.al. |
2309.15740v1 |
null |
2023-09-26 |
Classification of symmetry-enriched topological quantum spin liquids |
Weicheng Ye et.al. |
2309.15118v1 |
null |
2023-09-26 |
Doduo: Learning Dense Visual Correspondence from Unsupervised Semantic-Aware Flow |
Zhenyu Jiang et.al. |
2309.15110v1 |
null |
2023-09-27 |
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models |
Yaohui Wang et.al. |
2309.15103v2 |
null |
2023-09-26 |
VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning |
Han Lin et.al. |
2309.15091v1 |
null |
2023-09-26 |
Video-adverb retrieval with compositional adverb-action embeddings |
Thomas Hummel et.al. |
2309.15086v1 |
null |
2023-09-26 |
Challenges of building medical image datasets for development of deep learning software in stroke |
Alessandro Fontanella et.al. |
2309.15081v1 |
null |
2023-09-26 |
On Excess Risk Convergence Rates of Neural Network Classifiers |
Hyunouk Ko et.al. |
2309.15075v1 |
null |
2023-09-26 |
Language-EXtended Indoor SLAM (LEXIS): A Versatile System for Real-time Visual Scene Understanding |
Christina Kassab et.al. |
2309.15065v1 |
null |
2023-09-26 |
QUILT: Effective Multi-Class Classification on Quantum Computers Using an Ensemble of Diverse Quantum Classifiers |
Daniel Silver et.al. |
2309.15056v1 |
null |
2023-09-26 |
Thalamic nuclei segmentation from T$_1$-weighted MRI: unifying and benchmarking state-of-the-art methods with young and old cohorts |
Brendan Williams et.al. |
2309.15053v1 |
null |
2023-09-25 |
Extreme Parkour with Legged Robots |
Xuxin Cheng et.al. |
2309.14341v1 |
null |
2023-09-25 |
Chop & Learn: Recognizing and Generating Object-State Compositions |
Nirat Saini et.al. |
2309.14339v1 |
null |
2023-09-25 |
Human-Assisted Continual Robot Learning with Foundation Models |
Meenal Parakh et.al. |
2309.14321v1 |
null |
2023-09-25 |
MUTEX: Learning Unified Policies from Multimodal Task Specifications |
Rutav Shah et.al. |
2309.14320v1 |
null |
2023-09-25 |
DeepMesh: Mesh-based Cardiac Motion Tracking using Deep Learning |
Qingjie Meng et.al. |
2309.14306v1 |
null |
2023-09-25 |
NAS-NeRF: Generative Neural Architecture Search for Neural Radiance Fields |
Saeejith Nair et.al. |
2309.14293v1 |
null |
2023-09-25 |
CLIP-DIY: CLIP Dense Inference Yields Open-Vocabulary Semantic Segmentation For-Free |
Monika Wysoczańska et.al. |
2309.14289v1 |
null |
2023-09-25 |
Comparison of One- Two- and Three- Dimensional CNN models for Drawing-Test-Based Diagnostics of the Parkinson's Disease |
Xuechao Wang et.al. |
2309.14288v1 |
null |
2023-09-26 |
Virtual Hyperspectral Images Using Symmetric Autoencoders |
Archisman Bhattacharjee et.al. |
2309.14286v2 |
null |
2023-09-25 |
OmniEvent: A Comprehensive, Fair, and Easy-to-Use Toolkit for Event Understanding |
Hao Peng et.al. |
2309.14258v1 |
link |
2023-09-22 |
Robotic Offline RL from Internet Videos via Value-Function Pre-Training |
Chethan Bhateja et.al. |
2309.13041v1 |
null |
2023-09-22 |
Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception? |
Xiaoxiao Sun et.al. |
2309.13038v1 |
null |
2023-09-22 |
Encoding optimization for quantum machine learning demonstrated on a superconducting transmon qutrit |
Shuxiang Cao et.al. |
2309.13036v1 |
null |
2023-09-22 |
Performance Analysis of UNet and Variants for Medical Image Segmentation |
Walid Ehab et.al. |
2309.13013v1 |
null |
2023-09-22 |
Pursuing Counterfactual Fairness via Sequential Autoencoder Across Domains |
Yujie Lin et.al. |
2309.13005v1 |
null |
2023-09-22 |
Braid groups, elliptic curves, and resolving the quartic |
Peter Huxford et.al. |
2309.12999v1 |
null |
2023-09-22 |
License Plate Recognition Based On Multi-Angle View Model |
Dat Tran-Anh et.al. |
2309.12972v1 |
null |
2023-09-22 |
PI-RADS v2 Compliant Automated Segmentation of Prostate Zones Using co-training Motivated Multi-task Dual-Path CNN |
Arnab Das et.al. |
2309.12970v1 |
null |
2023-09-22 |
Detect Every Thing with Few Examples |
Xinyu Zhang et.al. |
2309.12969v1 |
link |
2023-09-22 |
Massive End-to-end Models for Short Search Queries |
Weiran Wang et.al. |
2309.12963v1 |
null |
2023-09-21 |
ForceSight: Text-Guided Mobile Manipulation with Visual-Force Goals |
Jeremy A. Collins et.al. |
2309.12312v1 |
null |
2023-09-21 |
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent |
Jianing Yang et.al. |
2309.12311v1 |
null |
2023-09-21 |
TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning |
Chaeyoung Jung et.al. |
2309.12306v1 |
null |
2023-09-22 |
PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation |
Shilin Yan et.al. |
2309.12303v2 |
link |
2023-09-21 |
See to Touch: Learning Tactile Dexterity through Visual Incentives |
Irmak Guzey et.al. |
2309.12300v1 |
null |
2023-09-21 |
The Broad Impact of Feature Imitation: Neural Enhancements Across Financial, Speech, and Physiological Domains |
Reza Khanmohammadi et.al. |
2309.12279v1 |
null |
2023-09-21 |
Enabling Quartile-based Estimated-Mean Gradient Aggregation As Baseline for Federated Image Classifications |
Yusen Wu et.al. |
2309.12267v1 |
null |
2023-09-21 |
Parallelizing non-linear sequential models over the sequence length |
Yi Heng Lim et.al. |
2309.12252v1 |
null |
2023-09-21 |
Adaptive Input-image Normalization for Solving Mode Collapse Problem in GAN-based X-ray Images |
Muhammad Muneeb Saad et.al. |
2309.12245v1 |
null |
2023-09-21 |
Model-based Clustering using Non-parametric Hidden Markov Models |
Elisabeth Gassiat et.al. |
2309.12238v1 |
null |
2023-09-20 |
A Large-scale Dataset for Audio-Language Representation Learning |
Luoyi Sun et.al. |
2309.11500v1 |
null |
2023-09-20 |
FreeU: Free Lunch in Diffusion U-Net |
Chenyang Si et.al. |
2309.11497v1 |
null |
2023-09-21 |
Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning |
Tianbao Xie et.al. |
2309.11489v2 |
null |
2023-09-20 |
First detection of CO$_2$ emission in a Centaur: JWST NIRSpec observations of 39P/Oterma |
O. Harrington Pinto et.al. |
2309.11486v1 |
null |
2023-09-20 |
Multi-Label Takagi-Sugeno-Kang Fuzzy System |
Qiongdan Lou et.al. |
2309.11469v1 |
null |
2023-09-20 |
Budget-Aware Pruning: Handling Multiple Domains with Less Parameters |
Samuel Felipe dos Santos et.al. |
2309.11464v1 |
null |
2023-09-20 |
AudioFool: Fast, Universal and synchronization-free Cross-Domain Attack on Speech Recognition |
Mohamad Fakih et.al. |
2309.11462v1 |
null |
2023-09-20 |
SkeleTR: Towards Skeleton-based Action Recognition in the Wild |
Haodong Duan et.al. |
2309.11445v1 |
null |
2023-09-20 |
A Systematic Review of Few-Shot Learning in Medical Imaging |
Eva Pachetti et.al. |
2309.11433v1 |
null |
2023-09-21 |
Video Screens for Hearing Research: Transmittance and Reflectance of Professional and Other Fabrics |
Jan Heeren et.al. |
2309.11430v2 |
null |
2023-09-19 |
Assessing the capacity of a denoising diffusion probabilistic model to reproduce spatial context |
Rucha Deshpande et.al. |
2309.10817v1 |
null |
2023-09-19 |
Multisource Holography |
Grace Kuo et.al. |
2309.10816v1 |
null |
2023-09-19 |
Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning |
Tianhua Zhang et.al. |
2309.10814v1 |
link |
2023-09-19 |
Semantic Text Compression for Classification |
Emrecan Kutay et.al. |
2309.10809v1 |
null |
2023-09-19 |
Multi-Context Dual Hyper-Prior Neural Image Compression |
Atefeh Khoshkhahtinat et.al. |
2309.10799v1 |
null |
2023-09-19 |
Multi-spectral Entropy Constrained Neural Compression of Solar Imagery |
Ali Zafari et.al. |
2309.10791v1 |
null |
2023-09-19 |
Guide Your Agent with Adaptive Multimodal Rewards |
Changyeon Kim et.al. |
2309.10790v1 |
link |
2023-09-19 |
Physics-Informed Machine Learning for Data Anomaly Detection, Classification, Localization, and Mitigation: A Review, Challenges, and Path Forward |
Mehdi Jabbari Zideh et.al. |
2309.10788v1 |
null |
2023-09-19 |
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models |
Yuan Tseng et.al. |
2309.10787v1 |
link |
2023-09-19 |
Context-Aware Neural Video Compression on Solar Dynamics Observatory |
Atefeh Khoshkhahtinat et.al. |
2309.10784v1 |
null |
2023-09-19 |
Des-q: a quantum algorithm to construct and efficiently retrain decision trees for regression and binary classification |
Niraj Kumar et.al. |
2309.09976v2 |
null |
2023-09-18 |
Empirical Study of Mix-based Data Augmentation Methods in Physiological Time Series Data |
Peikun Guo et.al. |
2309.09970v1 |
null |
2023-09-18 |
vSHARP: variable Splitting Half-quadratic ADMM algorithm for Reconstruction of inverse-Problems |
George Yiasemis et.al. |
2309.09954v1 |
null |
2023-09-18 |
TransientViT: A novel CNN - Vision Transformer hybrid real/bogus transient classifier for the Kilodegree Automatic Transient Survey |
Zhuoyang Chen et.al. |
2309.09937v1 |
null |
2023-09-18 |
Algebra of Self-Replication |
Lawrence S. Moss et.al. |
2309.09931v1 |
null |
2023-09-18 |
Evaluating Adversarial Robustness with Expected Viable Performance |
Ryan McCoppin et.al. |
2309.09928v1 |
null |
2023-09-18 |
Impact of Augmented reality system on elementary school ESL learners in country side of china: Motivations, achievements, behaviors and cognitive attainment |
Ijaz Ul Haq et.al. |
2309.09894v1 |
null |
2023-09-18 |
Not Enough Labeled Data? Just Add Semantics: A Data-Efficient Method for Inferring Online Health Texts |
Joseph Gatto et.al. |
2309.09877v1 |
null |
2023-09-18 |
Domain Generalization with Fourier Transform and Soft Thresholding |
Hongyi Pan et.al. |
2309.09866v1 |
null |
2023-09-18 |
Unsupervised Open-Vocabulary Object Localization in Videos |
Ke Fan et.al. |
2309.09858v1 |
null |
2023-09-18 |
Closing the Loop on Runtime Monitors with Fallback-Safe MPC |
Rohan Sinha et.al. |
2309.08603v2 |
null |
2023-09-15 |
Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes |
Fabien Delattre et.al. |
2309.08588v1 |
null |
2023-09-15 |
Compositional Foundation Models for Hierarchical Planning |
Anurag Ajay et.al. |
2309.08587v1 |
null |
2023-09-15 |
HINT: Healthy Influential-Noise based Training to Defend against Data Poisoning Attacks |
Minh-Hao Van et.al. |
2309.08549v1 |
null |
2023-09-15 |
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens |
Minsu Kim et.al. |
2309.08531v1 |
null |
2023-09-15 |
Generalised Probabilistic Diffusion Scale-Spaces |
Pascal Peter et.al. |
2309.08511v1 |
null |
2023-09-15 |
Deep-learning-powered data analysis in plankton ecology |
Harshith Bachimanchi et.al. |
2309.08500v1 |
link |
2023-09-15 |
P-ROCKET: Pruning Random Convolution Kernels for Time Series Classification |
Shaowu Chen et.al. |
2309.08499v1 |
link |
2023-09-15 |
YCB-Ev: Event-vision dataset for 6DoF object pose estimation |
Pavel Rojtberg et.al. |
2309.08482v1 |
link |
2023-09-15 |
Current and future directions in network biology |
Marinka Zitnik et.al. |
2309.08478v1 |
null |
2023-09-14 |
Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning |
Zhiwu Qing et.al. |
2309.07911v1 |
link |
2023-09-14 |
Generative Image Dynamics |
Zhengqi Li et.al. |
2309.07906v1 |
null |
2023-09-14 |
Ambiguity-Aware In-Context Learning with Large Language Models |
Lingyu Gao et.al. |
2309.07900v1 |
null |
2023-09-14 |
SMARTFEAT: Efficient Feature Construction through Feature-Level Foundation Model Interactions |
Yin Lin et.al. |
2309.07856v1 |
null |
2023-09-14 |
Two Timin': Repairing Smart Contracts With A Two-Layered Approach |
Abhinav Jain et.al. |
2309.07841v1 |
null |
2023-09-14 |
Text Classification of Cancer Clinical Trial Eligibility Criteria |
Yumeng Yang et.al. |
2309.07812v1 |
null |
2023-09-14 |
What Matters to Enhance Traffic Rule Compliance of Imitation Learning for Automated Driving |
Hongkuan Zhou et.al. |
2309.07808v1 |
null |
2023-09-14 |
Improving Multimodal Classification of Social Media Posts by Leveraging Image-Text Auxiliary tasks |
Danae Sánchez Villegas et.al. |
2309.07794v1 |
null |
2023-09-14 |
A Multi-In and Multi-Out Dendritic Neuron Model and its Optimization |
Yu Ding et.al. |
2309.07791v1 |
null |
2023-09-15 |
Virchow: A Million-Slide Digital Pathology Foundation Model |
Eugene Vorontsov et.al. |
2309.07778v2 |
null |
2023-09-13 |
Contrastive Deep Encoding Enables Uncertainty-aware Machine-learning-assisted Histopathology |
Nirhoshan Sivaroopan et.al. |
2309.07113v1 |
null |
2023-09-13 |
Data Augmentation via Subgroup Mixup for Improving Fairness |
Madeline Navarro et.al. |
2309.07110v1 |
null |
2023-09-13 |
The end sum of surfaces |
Liam K. Axon et.al. |
2309.07101v1 |
null |
2023-09-13 |
Revisiting the classics: On the evolutionary origin of the "Fe II" and "He/N" spectral classes of novae |
E. Aydi et.al. |
2309.07097v1 |
null |
2023-09-13 |
RadarLCD: Learnable Radar-based Loop Closure Detection Pipeline |
Mirko Usuelli et.al. |
2309.07094v1 |
null |
2023-09-13 |
Mitigating Group Bias in Federated Learning for Heterogeneous Devices |
Khotso Selialia et.al. |
2309.07085v1 |
null |
2023-09-13 |
The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning |
Alexander Bastounis et.al. |
2309.07072v1 |
null |
2023-09-13 |
Aggregating Long-term Sharp Features via Hybrid Transformers for Video Deblurring |
Dongwei Ren et.al. |
2309.07054v1 |
link |
2023-09-13 |
Thurston's theorem and the Nielsen-Thurston classification via Teichmüller's theorem |
James Belk et.al. |
2309.06993v1 |
null |
2023-09-13 |
Neural network-based coronary dominance classification of RCA angiograms |
Ivan Kruzhilov et.al. |
2309.06958v1 |
null |
2023-09-12 |
Learning Disentangled Avatars with Hybrid 3D Representations |
Yao Feng et.al. |
2309.06441v1 |
null |
2023-09-12 |
LEAP Hand: Low-Cost, Efficient, and Anthropomorphic Hand for Robot Learning |
Kenneth Shaw et.al. |
2309.06440v1 |
null |
2023-09-12 |
AGMDT: Virtual Staining of Renal Histology Images with Adjacency-Guided Multi-Domain Transfer |
Tao Ma et.al. |
2309.06421v1 |
null |
2023-09-12 |
Style2Fab: Functionality-Aware Segmentation for Fabricating Personalized 3D Models with Generative AI |
Faraz Faruqi et.al. |
2309.06379v1 |
null |
2023-09-12 |
Padding-free Convolution based on Preservation of Differential Characteristics of Kernels |
Kuangdai Leng et.al. |
2309.06370v1 |
null |
2023-09-12 |
Using Reed-Muller Codes for Classification with Rejection and Recovery |
Daniel Fentham et.al. |
2309.06359v1 |
link |
2023-09-12 |
Eccentric graph of trees and their Cartesian products |
Anita Arora et.al. |
2309.06338v1 |
null |
2023-09-12 |
Exploring Flat Minima for Domain Generalization with Large Learning Rates |
Jian Zhang et.al. |
2309.06337v1 |
null |
2023-09-12 |
Grounded Language Acquisition From Object and Action Imagery |
James Robert Kubricht et.al. |
2309.06335v1 |
null |
2023-09-12 |
Visualising Game Engine Subsystem Coupling |
Gabriel C. Ullmann et.al. |
2309.06329v1 |
null |
2023-09-11 |
Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction Clips |
Yufei Ye et.al. |
2309.05663v1 |
null |
2023-09-11 |
From Capture to Display: A Survey on Volumetric Video |
Yili Jin et.al. |
2309.05658v1 |
null |
2023-09-11 |
Potentials of Deterministic Radio Propagation Simulation for AI-Enabled Localization and Sensing |
Albrecht Michler et.al. |
2309.05650v1 |
null |
2023-09-11 |
A Novel Supervised Deep Learning Solution to Detect Distributed Denial of Service (DDoS) attacks on Edge Systems using Convolutional Neural Networks (CNN) |
Vedanth Ramanathan et.al. |
2309.05646v1 |
null |
2023-09-11 |
Boundary Peeling: Outlier Detection Method Using One-Class Peeling |
Sheikh Arafat et.al. |
2309.05630v1 |
null |
2023-09-11 |
Temporal Action Localization with Enhanced Instant Discriminability |
Dingfeng Shi et.al. |
2309.05590v1 |
link |
2023-09-11 |
Anisotropic Diffusion Stencils: From Simple Derivations over Stability Estimates to ResNet Implementations |
Karl Schrader et.al. |
2309.05575v1 |
null |
2023-09-11 |
On the Meromorphic Integrability of the Critical Systems for Optimal Sums of Eigenvalues |
Yuzhou Tian et.al. |
2309.05568v1 |
null |
2023-09-11 |
OpenFashionCLIP: Vision-and-Language Contrastive Learning with Open-Source Fashion Data |
Giuseppe Cartella et.al. |
2309.05551v1 |
link |
2023-09-11 |
Distance-Aware eXplanation Based Learning |
Misgina Tsighe Hagos et.al. |
2309.05548v1 |
link |
2023-09-08 |
Generalized Cross-domain Multi-label Few-shot Learning for Chest X-rays |
Aroof Aimen et.al. |
2309.04462v1 |
null |
2023-09-08 |
Generalized Variable Selection Algorithms for Gaussian Process Models by LASSO-like Penalty |
Zhiyong Hu et.al. |
2309.04455v1 |
null |
2023-09-08 |
Vis-SPLIT: Interactive Hierarchical Modeling for mRNA Expression Classification |
Braden Roper et.al. |
2309.04423v1 |
null |
2023-09-08 |
Video Task Decathlon: Unifying Image and Video Tasks in Autonomous Driving |
Thomas E. Huang et.al. |
2309.04422v1 |
null |
2023-09-08 |
Seeing-Eye Quadruped Navigation with Force Responsive Locomotion Control |
David DeFazio et.al. |
2309.04370v1 |
null |
2023-09-08 |
Active Learning for Classifying 2D Grid-Based Level Completability |
Mahsa Bazzaz et.al. |
2309.04367v1 |
link |
2023-09-08 |
Sparse Codesigned Communication and Radar Systems |
Hyeon Seok Rou et.al. |
2309.04362v1 |
null |
2023-09-08 |
Learning from Power Signals: An Automated Approach to Electrical Disturbance Identification Within a Power Transmission System |
Jonathan D. Boyd et.al. |
2309.04361v1 |
null |
2023-09-08 |
Zero-Shot Robustification of Zero-Shot Models With Foundation Models |
Dyah Adila et.al. |
2309.04344v1 |
null |
2023-09-08 |
Encoding Multi-Domain Scientific Papers by Ensembling Multiple CLS Tokens |
Ronald Seoh et.al. |
2309.04333v1 |
link |
2023-09-07 |
A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal Multi-Organ Segmentation |
Ziyan Huang et.al. |
2309.03906v1 |
link |
2023-09-07 |
ImageBind-LLM: Multi-modality Instruction Tuning |
Jiaming Han et.al. |
2309.03905v1 |
link |
2023-09-07 |
Tracking Anything with Decoupled Video Segmentation |
Ho Kei Cheng et.al. |
2309.03903v1 |
link |
2023-09-07 |
Learning Continuous Exposure Value Representations for Single-Image HDR Reconstruction |
Su-Kai Chen et.al. |
2309.03900v1 |
null |
2023-09-07 |
The Making and Breaking of Camouflage |
Hala Lamdouar et.al. |
2309.03899v1 |
null |
2023-09-07 |
ProPainter: Improving Propagation and Transformer for Video Inpainting |
Shangchen Zhou et.al. |
2309.03897v1 |
null |
2023-09-07 |
Zero-Shot Audio Captioning via Audibility Guidance |
Tal Shaharabany et.al. |
2309.03884v1 |
null |
2023-09-07 |
Text-to-feature diffusion for audio-visual few-shot learning |
Otniel-Bogdan Mercea et.al. |
2309.03869v1 |
null |
2023-09-07 |
Classification of Killing Magnetic Curves In H^3 |
Özgür Kelekçi et.al. |
2309.03859v1 |
null |
2023-09-07 |
CenTime: Event-Conditional Modelling of Censoring in Survival Analysis |
Ahmed H. Shahin et.al. |
2309.03851v1 |
link |
2023-09-07 |
Terahertz-Band Direction Finding With Beam-Split and Mutual Coupling Calibration |
Ahmet M. Elbir et.al. |
2309.03195v2 |
null |
2023-09-06 |
Signatures of Bayesian inference emerge from energy efficient synapses |
James Malkin et.al. |
2309.03194v1 |
null |
2023-09-06 |
3D Transformer based on deformable patch location for differential diagnosis between Alzheimer's disease and Frontotemporal dementia |
Huy-Dung Nguyen et.al. |
2309.03183v1 |
null |
2023-09-06 |
PDiscoNet: Semantically consistent part discovery for fine-grained recognition |
Robert van der Klis et.al. |
2309.03173v1 |
null |
2023-09-06 |
ResFields: Residual Neural Fields for Spatiotemporal Signals |
Marko Mihajlovic et.al. |
2309.03160v1 |
null |
2023-09-06 |
Normal mode decomposition of atomic motion in solids |
Jaeyun Moon et.al. |
2309.03140v1 |
null |
2023-09-06 |
Serving Time: Real-Time, Safe Motion Planning and Control for Manipulation of Unsecured Objects |
Zachary Brei et.al. |
2309.03111v1 |
null |
2023-09-06 |
The Secrets of Non-Blind Poisson Deconvolution |
Abhiram Gnanasambandam et.al. |
2309.03105v1 |
null |
2023-09-06 |
On the $\Sigma$-invariants of Artin groups satisfying the $K(\pi,1)$-conjecture |
Marcos Escartín Ferrer et.al. |
2309.03091v1 |
null |
2023-09-06 |
Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection |
Yu Chen et.al. |
2309.03057v1 |
null |
2023-09-05 |
ReliTalk: Relightable Talking Portrait Generation from a Single Video |
Haonan Qiu et.al. |
2309.02434v1 |
link |
2023-09-05 |
A Likelihood Approach to Incorporating Self-Report Data in HIV Recency Classification |
Wenlong Yang et.al. |
2309.02430v1 |
null |
2023-09-05 |
Building a Winning Team: Selecting Source Model Ensembles using a Submodular Transferability Estimation Approach |
Vimal K B et.al. |
2309.02429v1 |
null |
2023-09-05 |
EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding |
Yue Xu et.al. |
2309.02423v1 |
null |
2023-09-05 |
Doppelgangers: Learning to Disambiguate Images of Similar Structures |
Ruojin Cai et.al. |
2309.02420v1 |
link |
2023-09-05 |
Classification of La$^{3+}$ and Gd$^{3+}$ rare earth ions using surface-enhanced Raman scattering |
Hao Jin et.al. |
2309.02409v1 |
null |
2023-09-05 |
Semantic Communications Based on Adaptive Generative Models and Information Bottleneck |
S. Barbarossa et.al. |
2309.02387v1 |
null |
2023-09-05 |
On the classification of primitive ideals for complex classical Lie algebras, IV |
William McGovern et.al. |
2309.02363v1 |
null |
2023-09-05 |
Generating Infinite-Resolution Texture using GANs with Patch-by-Patch Paradigm |
Alhasan Abdellatif et.al. |
2309.02340v1 |
null |
2023-09-05 |
DEEPBEAS3D: Deep Learning and B-Spline Explicit Active Surfaces |
Helena Williams et.al. |
2309.02335v1 |
null |
2023-09-01 |
Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following |
Ziyu Guo et.al. |
2309.00615v1 |
link |
2023-09-01 |
Amyloid-Beta Axial Plane PET Synthesis from Structural MRI: An Image Translation Approach for Screening Alzheimer's Disease |
Fernando Vega et.al. |
2309.00569v1 |
null |
2023-09-01 |
Powder-Bot: A Modular Autonomous Multi-Robot Workflow for Powder X-Ray Diffraction |
Amy M. Lunt et.al. |
2309.00544v1 |
null |
2023-09-01 |
A Machine Vision Method for Correction of Eccentric Error: Based on Adaptive Enhancement Algorithm |
Fanyi Wang et.al. |
2309.00514v1 |
null |
2023-09-01 |
Multi-stage Deep Learning Artifact Reduction for Computed Tomography |
Jiayang Shi et.al. |
2309.00494v1 |
null |
2023-09-01 |
Geometry-aware Line Graph Transformer Pre-training for Molecular Property Prediction |
Peizhen Bai et.al. |
2309.00483v1 |
null |
2023-09-01 |
Deep Joint Source-Channel Coding for Adaptive Image Transmission over MIMO Channels |
Haotian Wu et.al. |
2309.00470v1 |
null |
2023-09-01 |
New metrics for analyzing continual learners |
Nicolas Michel et.al. |
2309.00462v1 |
null |
2023-09-01 |
The miniJPAS survey quasar selection IV: Classification and redshift estimation with SQUEzE |
Ignasi Pérez-Ràfols et.al. |
2309.00461v1 |
null |
2023-09-01 |
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding |
Étienne Labbé et.al. |
2309.00454v1 |
link |
2023-08-31 |
PointLLM: Empowering Large Language Models to Understand Point Clouds |
Runsen Xu et.al. |
2308.16911v1 |
link |
2023-08-31 |
StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation |
Yuhan Wang et.al. |
2308.16909v1 |
link |
2023-08-31 |
Learning to Taste: A Multimodal Wine Dataset |
Thoranna Bender et.al. |
2308.16900v1 |
null |
2023-08-31 |
EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in the Wild |
Manuel Kaufmann et.al. |
2308.16894v1 |
link |
2023-08-31 |
On the Role of Non-Localities in Fundamental Diagram Estimation |
Jing Liu et.al. |
2308.16878v1 |
null |
2023-08-31 |
SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation |
Jiaben Chen et.al. |
2308.16876v1 |
null |
2023-08-31 |
Understanding defects in amorphous silicon with million-atom simulations and machine learning |
Joe D. Morrow et.al. |
2308.16868v1 |
null |
2023-08-31 |
Self-pruning Graph Neural Network for Predicting Inflammatory Disease Activity in Multiple Sclerosis from Brain MR Images |
Chinmay Prabhakar et.al. |
2308.16863v1 |
link |
2023-08-31 |
Facing Unknown: Open-World Encrypted Traffic Classification Based on Contrastive Pre-Training |
Xiang Li et.al. |
2308.16861v1 |
null |
2023-08-31 |
Majorization-Minimization for sparse SVMs |
Alessandro Benfenati et.al. |
2308.16858v1 |
null |
2023-08-30 |
Fully Non-Linear Neuromorphic Computing with Linear Wave Scattering |
Clara C. Wanjura et.al. |
2308.16181v1 |
null |
2023-08-30 |
General Purpose Audio Effect Removal |
Matthew Rice et.al. |
2308.16177v1 |
null |
2023-08-30 |
Algebraic, Topological, and Mereological Foundations of Existential Granules |
Mani A et.al. |
2308.16157v1 |
null |
2023-08-31 |
MMVP: Motion-Matrix-based Video Prediction |
Yiqi Zhong et.al. |
2308.16154v2 |
link |
2023-08-30 |
Modality Cycles with Masked Conditional Diffusion for Unsupervised Anomaly Segmentation in MRI |
Ziyun Liang et.al. |
2308.16150v1 |
null |
2023-08-30 |
Spatial Graph Coarsening: Weather and Weekday Prediction with London's Bike-Sharing Service using GNN |
Yuta Sato et.al. |
2308.16122v1 |
null |
2023-08-30 |
Learned Image Reasoning Prior Penetrates Deep Unfolding Network for Panchromatic and Multi-Spectral Image Fusion |
Man Zhou et.al. |
2308.16083v1 |
null |
2023-08-30 |
A Classification of Observation-Driven State-Space Count Models for Panel Data |
Jae Youn Ahn et.al. |
2308.16058v1 |
null |
2023-08-30 |
Low-Rank Multitask Learning based on Tensorized SVMs and LSSVMs |
Jiani Liu et.al. |
2308.16056v1 |
null |
2023-08-30 |
Telepresence Lantern -- Designing an Immersive Video-Mediated Communication Device for Older Adults |
Thomas H. Weisswange et.al. |
2308.16052v1 |
null |
2023-08-29 |
An Adaptive Tangent Feature Perspective of Neural Networks |
Daniel LeJeune et.al. |
2308.15478v1 |
null |
2023-08-29 |
A General-Purpose Self-Supervised Model for Computational Pathology |
Richard J. Chen et.al. |
2308.15474v1 |
null |
2023-08-29 |
Learning Modulated Transformation in GANs |
Ceyuan Yang et.al. |
2308.15472v1 |
null |
2023-08-30 |
Policy composition in reinforcement learning via multi-objective policy optimization |
Shruti Mishra et.al. |
2308.15470v2 |
null |
2023-08-29 |
Input margins can predict generalization too |
Coenraad Mouton et.al. |
2308.15466v1 |
null |
2023-08-29 |
A Comparative Study of Loss Functions: Traffic Predictions in Regular and Congestion Scenarios |
Yangxinyu Xie et.al. |
2308.15464v1 |
link |
2023-08-29 |
Online Overexposed Pixels Hallucination in Videos with Adaptive Reference Frame Selection |
Yazhou Xing et.al. |
2308.15462v1 |
null |
2023-08-29 |
From SMOTE to Mixup for Deep Imbalanced Classification |
Wei-Chao Cheng et.al. |
2308.15457v1 |
link |
2023-08-29 |
Pseudo-Boolean Polynomials Approach To Edge Detection And Image Segmentation |
Tendai Mapungwana Chikake et.al. |
2308.15453v1 |
null |
2023-08-29 |
WrappingNet: Mesh Autoencoder via Deep Sphere Deformation |
Eric Lei et.al. |
2308.15413v1 |
null |
2023-08-28 |
MagicEdit: High-Fidelity and Temporally Coherent Video Editing |
Jun Hao Liew et.al. |
2308.14749v1 |
null |
2023-08-28 |
MagicAvatar: Multimodal Avatar Generation and Animation |
Jianfeng Zhang et.al. |
2308.14748v1 |
null |
2023-08-28 |
CoVR: Learning Composed Video Retrieval from Web Video Captions |
Lucas Ventura et.al. |
2308.14746v1 |
link |
2023-08-28 |
Total Selfie: Generating Full-Body Selfies |
Bowei Chen et.al. |
2308.14740v1 |
null |
2023-08-28 |
PanoSwin: a Pano-style Swin Transformer for Panorama Understanding |
Zhixin Ling et.al. |
2308.14726v1 |
null |
2023-08-28 |
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation |
Xudong Wang et.al. |
2308.14710v1 |
link |
2023-08-28 |
Fine-Tuning Llama 2 Large Language Models for Detecting Online Sexual Predatory Chats and Abusive Texts |
Thanh Thi Nguyen et.al. |
2308.14683v1 |
null |
2023-08-28 |
Video-Based Hand Pose Estimation for Remote Assessment of Bradykinesia in Parkinson's Disease |
Gabriela T. Acevedo Trebbau et.al. |
2308.14679v1 |
null |
2023-08-28 |
Noncommutative tensor triangular geometry: classification via noetherian spectra |
James Rowe et.al. |
2308.14661v1 |
null |
2023-08-28 |
Towards Standardized Disturbance Rejection Testing of Legged Robot Locomotion with Linear Impactor: A Preliminary Study, Observations, and Implications |
Bowen Weng et.al. |
2308.14636v1 |
null |
2023-08-25 |
Unveiling the Role of Message Passing in Dual-Privacy Preservation on GNNs |
Tianyi Zhao et.al. |
2308.13513v1 |
null |
2023-08-25 |
Joint Modeling of Feature, Correspondence, and a Compressed Memory for Video Object Segmentation |
Jiaming Zhang et.al. |
2308.13505v1 |
null |
2023-08-25 |
Attending Generalizability in Course of Deep Fake Detection by Exploring Multi-task Learning |
Pranav Balaji et.al. |
2308.13503v1 |
null |
2023-08-25 |
Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers |
Matthew Dutson et.al. |
2308.13494v1 |
link |
2023-08-25 |
Temporal Uncertainty Localization to Enable Human-in-the-loop Analysis of Dynamic Contrast-enhanced Cardiac MRI Datasets |
Dilek M. Yalcinkaya et.al. |
2308.13488v1 |
null |
2023-08-25 |
QKSAN: A Quantum Kernel Self-Attention Network |
Ren-Xin Zhao et.al. |
2308.13422v1 |
null |
2023-08-25 |
An investigation into the impact of deep learning model choice on sex and race bias in cardiac MR segmentation |
Tiarna Lee et.al. |
2308.13415v1 |
null |
2023-08-25 |
Self-Supervised Representation Learning with Cross-Context Learning between Global and Hypercolumn Features |
Zheng Gao et.al. |
2308.13392v1 |
null |
2023-08-25 |
Direction-aware Video Demoireing with Temporal-guided Bilateral Learning |
Shuning Xu et.al. |
2308.13388v1 |
null |
2023-08-25 |
On flags of holomorphic foliations associated with singular second-order ordinary differential equations |
Fernando Lourenço et.al. |
2308.13370v1 |
null |
2023-08-24 |
POCO: 3D Pose and Shape Estimation with Confidence |
Sai Kumar Dwivedi et.al. |
2308.12965v1 |
null |
2023-08-24 |
Motion-Guided Masking for Spatiotemporal Representation Learning |
David Fan et.al. |
2308.12962v1 |
null |
2023-08-24 |
Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment |
Sheng Zhang et.al. |
2308.12960v1 |
link |
2023-08-24 |
Beyond Document Page Classification: Design, Datasets, and Challenges |
Jordy Van Landeghem et.al. |
2308.12896v1 |
null |
2023-08-24 |
Large Language Models Vote: Prompting for Rare Disease Identification |
David Oniani et.al. |
2308.12890v1 |
link |
2023-08-24 |
Multi-stage feature decorrelation constraints for improving CNN classification performance |
Qiuyu Zhu et.al. |
2308.12880v1 |
null |
2023-08-24 |
ToonTalker: Cross-Domain Face Reenactment |
Yuan Gong et.al. |
2308.12866v1 |
null |
2023-08-24 |
Learned Local Attention Maps for Synthesising Vessel Segmentations |
Yash Deo et.al. |
2308.12861v1 |
null |
2023-08-24 |
Algebraicity of hypergeometric functions with arbitrary parameters |
Florian Fürnsinn et.al. |
2308.12855v1 |
null |
2023-08-24 |
$p$-brane Galilean and Carrollian Geometries and Gravities |
Eric Bergshoeff et.al. |
2308.12852v1 |
null |
2023-08-23 |
Simple is Better and Large is Not Enough: Towards Ensembling of Foundational Language Models |
Nancy Tyagi et.al. |
2308.12272v1 |
null |
2023-08-23 |
Bugsplainer: Leveraging Code Structures to Explain Software Bugs with Neural Machine Translation |
Parvez Mahbub et.al. |
2308.12267v1 |
null |
2023-08-23 |
SPPNet: A Single-Point Prompt Network for Nuclei Image Segmentation |
Qing Xu et.al. |
2308.12231v1 |
link |
2023-08-23 |
Towards Real-Time Analysis of Broadcast Badminton Videos |
Nitin Nilesh et.al. |
2308.12199v1 |
null |
2023-08-23 |
Sign Language Translation with Iterative Prototype |
Huijie Yao et.al. |
2308.12191v1 |
null |
2023-08-23 |
Tumor-Centered Patching for Enhanced Medical Image Segmentation |
Mutyyba Asghar et.al. |
2308.12168v1 |
null |
2023-08-23 |
Constant mean curvature hypersurfaces in Anti-de Sitter space |
Enrico Trebeschi et.al. |
2308.12167v1 |
null |
2023-08-23 |
NPF-200: A Multi-Modal Eye Fixation Dataset and Method for Non-Photorealistic Videos |
Ziyu Yang et.al. |
2308.12163v1 |
null |
2023-08-23 |
A Probabilistic Fluctuation based Membership Inference Attack for Generative Models |
Wenjie Fu et.al. |
2308.12143v1 |
null |
2023-08-23 |
Masking Strategies for Background Bias Removal in Computer Vision Models |
Ananthu Aniraj et.al. |
2308.12127v1 |
link |
2023-08-22 |
StoryBench: A Multifaceted Benchmark for Continuous Story Visualization |
Emanuele Bugliarello et.al. |
2308.11606v1 |
link |
2023-08-22 |
Semantic Multi-Resolution Communications |
Matin Mortaheb et.al. |
2308.11604v1 |
null |
2023-08-22 |
EndoNet: model for automatic calculation of H-score on histological slides |
Egor Ushakov et.al. |
2308.11562v1 |
null |
2023-08-22 |
Open Set Synthetic Image Source Attribution |
Shengbang Fang et.al. |
2308.11557v1 |
null |
2023-08-22 |
Multi-event Video-Text Retrieval |
Gengyuan Zhang et.al. |
2308.11551v1 |
link |
2023-08-22 |
Furnishing Sound Event Detection with Language Model Abilities |
Hualei Wang et.al. |
2308.11530v1 |
null |
2023-08-22 |
LCCo: Lending CLIP to Co-Segmentation |
Xin Duan et.al. |
2308.11506v1 |
null |
2023-08-23 |
Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition |
Qitong Wang et.al. |
2308.11489v2 |
link |
2023-08-22 |
Opening the Vocabulary of Egocentric Actions |
Dibyadip Chatterjee et.al. |
2308.11488v1 |
null |
2023-08-22 |
Free Lunch for Gait Recognition: A Novel Relation Descriptor |
Jilong Wang et.al. |
2308.11487v1 |
null |
2023-08-21 |
Structured World Models from Human Videos |
Russell Mendonca et.al. |
2308.10901v1 |
null |
2023-08-21 |
Unlocking Accuracy and Fairness in Differentially Private Image Classification |
Leonard Berrada et.al. |
2308.10888v1 |
null |
2023-08-21 |
Evaluating quantum generative models via imbalanced data classification benchmarks |
Graham R. Enos et.al. |
2308.10847v1 |
null |
2023-08-21 |
Pixel Adaptive Deep Unfolding Transformer for Hyperspectral Image Reconstruction |
Miaoyu Li et.al. |
2308.10820v1 |
null |
2023-08-21 |
Improving Continuous Sign Language Recognition with Cross-Lingual Signs |
Fangyun Wei et.al. |
2308.10809v1 |
null |
2023-08-21 |
DynED: Dynamic Ensemble Diversification in Data Stream Classification |
Soheil Abadifard et.al. |
2308.10807v1 |
link |
2023-08-21 |
MGMAE: Motion Guided Masking for Video Masked Autoencoding |
Bingkun Huang et.al. |
2308.10794v1 |
null |
2023-08-21 |
Extraction of Text from Optic Nerve Optical Coherence Tomography Reports |
Iyad Majid et.al. |
2308.10790v1 |
null |
2023-08-21 |
Dense Error Map Estimation for MRI-Ultrasound Registration in Brain Tumor Surgery Using Swin UNETR |
Soorena Salari et.al. |
2308.10784v1 |
null |
2023-08-21 |
Superfluid weight in the isolated band limit within the generalized random phase approximation |
Minh Tam et.al. |
2308.10780v1 |
null |
2023-08-18 |
Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization |
Soumik Mukhopadhyay et.al. |
2308.09716v1 |
link |
2023-08-18 |
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis |
Jonathon Luiten et.al. |
2308.09713v1 |
null |
2023-08-18 |
SimDA: Simple Diffusion Adapter for Efficient Video Generation |
Zhen Xing et.al. |
2308.09710v1 |
null |
2023-08-18 |
Invariant Training 2D-3D Joint Hard Samples for Few-Shot Point Cloud Recognition |
Xuanyu Yi et.al. |
2308.09694v1 |
null |
2023-08-18 |
A Lightweight Transformer for Faster and Robust EBSD Data Collection |
Harry Dong et.al. |
2308.09693v1 |
link |
2023-08-18 |
Audiovisual Moments in Time: A Large-Scale Annotated Dataset of Audiovisual Actions |
Michael Joannou et.al. |
2308.09685v1 |
link |
2023-08-18 |
Quantifying Uncertainties of Contact Classifications in a Human-Robot Collaboration with Parallel Robots |
Aran Mohammad et.al. |
2308.09675v1 |
null |
2023-08-18 |
Classification of modular data up to rank 11 |
Siu-Hung Ng et.al. |
2308.09670v1 |
null |
2023-08-18 |
Collision Isolation and Identification Using Proprioceptive Sensing for Parallel Robots to Enable Human-Robot Collaboration |
Aran Mohammad et.al. |
2308.09650v1 |
null |
2023-08-18 |
Robust Uncertainty Quantification using Conformalised Monte Carlo Prediction |
Daniel Bethell et.al. |
2308.09647v1 |
link |
2023-08-16 |
MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions |
Henghui Ding et.al. |
2308.08544v1 |
link |
2023-08-16 |
Deployment and Analysis of Instance Segmentation Algorithm for In-field Grade Estimation of Sweetpotatoes |
Hoang M. Nguyen et.al. |
2308.08534v1 |
null |
2023-08-16 |
Diagnosing Human-object Interaction Detectors |
Fangrui Zhu et.al. |
2308.08529v1 |
link |
2023-08-17 |
Exploiting Point-Wise Attention in 6D Object Pose Estimation Based on Bidirectional Prediction |
Yuhao Yang et.al. |
2308.08518v2 |
null |
2023-08-17 |
Two-and-a-half Order Score-based Model for Solving 3D Ill-posed Inverse Problems |
Zirong Li et.al. |
2308.08511v2 |
null |
2023-08-16 |
ResBuilder: Automated Learning of Depth with Residual Structures |
Julian Burghoff et.al. |
2308.08504v1 |
null |
2023-08-16 |
Galactic Archaeology: Tracing the Milky Way's Formation and Evolution through Stellar Populations |
J. Alfredo Collazos et.al. |
2308.08492v1 |
null |
2023-08-16 |
Label Propagation Techniques for Artifact Detection in Imbalanced Classes using Photoplethysmogram Signals |
Clara Macabiau et.al. |
2308.08480v1 |
null |
2023-08-16 |
DeDoDe: Detect, Don't Describe -- Describe, Don't Detect for Local Feature Matching |
Johan Edstedt et.al. |
2308.08479v1 |
link |
2023-08-16 |
Classification Committee for Active Deep Object Detection |
Lei Zhao et.al. |
2308.08476v1 |
null |
2023-08-15 |
CoDeF: Content Deformation Fields for Temporally Consistent Video Processing |
Hao Ouyang et.al. |
2308.07926v1 |
link |
2023-08-15 |
Helping Hands: An Object-Aware Ego-Centric Video Recognition Model |
Chuhan Zhang et.al. |
2308.07918v1 |
link |
2023-08-15 |
Relightable and Animatable Neural Avatar from Sparse-View Video |
Zhen Xu et.al. |
2308.07903v1 |
null |
2023-08-15 |
Back to Basics: A Sanity Check on Modern Time Series Classification Algorithms |
Bhaskar Dhariyal et.al. |
2308.07886v1 |
link |
2023-08-15 |
The Challenge of Fetal Cardiac MRI Reconstruction Using Deep Learning |
Denis Prokopenko et.al. |
2308.07885v1 |
null |
2023-08-15 |
Towards Temporal Edge Regression: A Case Study on Agriculture Trade Between Nations |
Lekang Jiang et.al. |
2308.07883v1 |
link |
2023-08-15 |
Synthesizing Political Zero-Shot Relation Classification via Codebook Knowledge, NLI, and ChatGPT |
Yibo Hu et.al. |
2308.07876v1 |
null |
2023-08-15 |
SEDA: Self-Ensembling ViT with Defensive Distillation and Adversarial Training for robust Chest X-rays Classification |
Raza Imam et.al. |
2308.07874v1 |
link |
2023-08-15 |
Sequence Processing with Quantum Tensor Networks |
Carys Harvey et.al. |
2308.07865v1 |
null |
2023-08-15 |
ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition |
Yixuan Zhou et.al. |
2308.07815v1 |
link |
2023-08-14 |
Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification |
Olesya Razuvayevskaya et.al. |
2308.07282v1 |
null |
2023-08-14 |
A Robust Approach Towards Distinguishing Natural and Computer Generated Images using Multi-Colorspace fused and Enriched Vision Transformer |
Manjary P Gangan et.al. |
2308.07279v1 |
null |
2023-08-14 |
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models |
Peng Wang et.al. |
2308.07269v1 |
link |
2023-08-14 |
Diving with Penguins: Detecting Penguins and their Prey in Animal-borne Underwater Videos via Deep Learning |
Kejia Zhang et.al. |
2308.07267v1 |
null |
2023-08-14 |
Large-kernel Attention for Efficient and Robust Brain Lesion Segmentation |
Liam Chalcroft et.al. |
2308.07251v1 |
link |
2023-08-14 |
LCE -- An Augmented Combination of Bagging and Boosting in Python |
Kevin Fauvel et.al. |
2308.07250v1 |
link |
2023-08-14 |
Large-scale environment mapping and immersive human-robot interaction for agricultural mobile robot teleoperation |
Tao Liu et.al. |
2308.07231v1 |
null |
2023-08-14 |
Almost fine gradings on algebras and classification of gradings up to isomorphism |
Alberto Elduque et.al. |
2308.07230v1 |
null |
2023-08-14 |
Distance Matters For Improving Performance Estimation Under Covariate Shift |
Mélanie Roschewitz et.al. |
2308.07223v1 |
link |
2023-08-15 |
AudioFormer: Audio Transformer learns audio feature representations from discrete acoustic codes |
Zhaohui Li et.al. |
2308.07221v2 |
link |
2023-08-11 |
ARGUS: Visualization of AI-Assisted Task Guidance in AR |
Sonia Castelo et.al. |
2308.06246v1 |
null |
2023-08-11 |
Exploring Predicate Visual Context in Detecting of Human-Object Interactions |
Frederic Z. Zhang et.al. |
2308.06202v1 |
link |
2023-08-11 |
Weakly Supervised Text Classification on Free Text Comments in Patient-Reported Outcome Measures |
Anna-Grace Linton et.al. |
2308.06199v1 |
null |
2023-08-11 |
Physical Adversarial Attacks For Camera-based Smart Systems: Current Trends, Categorization, Applications, Research Challenges, and Future Outlook |
Amira Guesmi et.al. |
2308.06173v1 |
null |
2023-08-11 |
Extrinsic geometry and linear differential equations of $\mathfrak{sl}_3$-type |
Boris Doubrov et.al. |
2308.06169v1 |
null |
2023-08-11 |
Rethinking the Localization in Weakly Supervised Object Localization |
Rui Xu et.al. |
2308.06161v1 |
null |
2023-08-11 |
Identification of the Relevance of Comments in Codes Using Bag of Words and Transformer Based Models |
Sruthi S et.al. |
2308.06144v1 |
link |
2023-08-11 |
Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping |
Yasser Abdelaziz Dahou Djilali et.al. |
2308.06112v1 |
null |
2023-08-11 |
Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation |
Philipp Vaeth et.al. |
2308.06100v1 |
link |
2023-08-11 |
Automated Construction of Time-Space Diagrams for Traffic Analysis Using Street-View Video Sequence |
Tanay Rastogi et.al. |
2308.06098v1 |
null |
2023-08-10 |
Follow Anything: Open-set detection, tracking, and following in real-time |
Alaa Maalouf et.al. |
2308.05737v1 |
link |
2023-08-10 |
FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models |
Guangkai Xu et.al. |
2308.05733v1 |
null |
2023-08-10 |
Optimizing Performance of Feedforward and Convolutional Neural Networks through Dynamic Activation Functions |
Chinmay Rane et.al. |
2308.05724v1 |
null |
2023-08-10 |
Towards the Automorphism Conjecture I: Combinatorial Control and Compensation for Factorials |
Bernd S. W. Schröder et.al. |
2308.05715v1 |
null |
2023-08-10 |
Automatic Extraction of Relevant Road Infrastructure using Connected vehicle data and Deep Learning Model |
Adu-Gyamfi Kojo et.al. |
2308.05658v1 |
null |
2023-08-10 |
Attention-based 3D CNN with Multi-layer Features for Alzheimer's Disease Diagnosis using Brain Images |
Yanteng Zhang et.al. |
2308.05655v1 |
null |
2023-08-10 |
Counterfactual Cross-modality Reasoning for Weakly Supervised Video Moment Localization |
Zezhong Lv et.al. |
2308.05648v1 |
link |
2023-08-10 |
Self-Supervised Monocular Depth Estimation by Direction-aware Cumulative Convolution Network |
Wencheng Han et.al. |
2308.05605v1 |
link |
2023-08-10 |
Object Goal Navigation with Recursive Implicit Maps |
Shizhe Chen et.al. |
2308.05602v1 |
null |
2023-08-10 |
You Only Prompt Once: On the Capabilities of Prompt Learning on Large Language Models to Tackle Toxic Content |
Xinlei He et.al. |
2308.05596v1 |
null |
2023-08-09 |
Improved Multi-Shot Diffusion-Weighted MRI with Zero-Shot Self-Supervised Learning Reconstruction |
Jaejin Cho et.al. |
2308.05103v1 |
link |
2023-08-09 |
DOST -- Domain Obedient Self-supervised Training for Multi Label Classification with Noisy Labels |
Soumadeep Saha et.al. |
2308.05101v1 |
null |
2023-08-09 |
Constructing Holistic Spatio-Temporal Scene Graph for Video Semantic Role Labeling |
Yu Zhao et.al. |
2308.05081v1 |
null |
2023-08-10 |
Geometric Learning-Based Transformer Network for Estimation of Segmentation Errors |
Sneha Sree C et.al. |
2308.05068v2 |
null |
2023-08-09 |
PAT: Position-Aware Transformer for Dense Multi-Label Action Detection |
Faegheh Sardari et.al. |
2308.05051v1 |
null |
2023-08-09 |
Collaborative Wideband Spectrum Sensing and Scheduling for Networked UAVs in UTM Systems |
Sravan Reddy Chintareddy et.al. |
2308.05036v1 |
null |
2023-08-09 |
Expert load matters: operating networks at high accuracy and low manual effort |
Sara Sangalli et.al. |
2308.05035v1 |
null |
2023-08-09 |
MetRoBERTa: Leveraging Traditional Customer Relationship Management Data to Develop a Transit-Topic-Aware Language Model |
Michael Leong et.al. |
2308.05012v1 |
null |
2023-08-09 |
Exploring Multilingual Text Data Distillation |
Shivam Sahni et.al. |
2308.04982v1 |
link |
2023-08-09 |
CasCIFF: A Cross-Domain Information Fusion Framework Tailored for Cascade Prediction in Social Networks |
Hongjun Zhu et.al. |
2308.04961v1 |
null |
2023-08-08 |
A Deep-Learning Method Using Auto-encoder and Generative Adversarial Network for Anomaly Detection on Ancient Stone Stele Surfaces |
Yikun Liu et.al. |
2308.04426v1 |
null |
2023-08-08 |
A Bi-directional Multi-hop Inference Model for Joint Dialog Sentiment Classification and Act Recognition |
Li Zheng et.al. |
2308.04424v1 |
null |
2023-08-08 |
DiffCR: A Fast Conditional Diffusion Framework for Cloud Removal from Optical Satellite Images |
Xuechao Zou et.al. |
2308.04417v1 |
null |
2023-08-08 |
Probabilistic Invariant Learning with Randomized Linear Classifiers |
Leonardo Cotta et.al. |
2308.04412v1 |
null |
2023-08-08 |
Data Augmentation-Based Unsupervised Domain Adaptation In Medical Imaging |
Sebastian Nørgaard Llambias et.al. |
2308.04395v1 |
null |
2023-08-08 |
SSTFormer: Bridging Spiking Neural Network and Memory Support Transformer for Frame-Event based Recognition |
Xiao Wang et.al. |
2308.04369v1 |
link |
2023-08-08 |
Vascular Ageing and Smoking Habit Prediction via a Low-Cost Single-Lead ECG Module |
S. Anas Ali et.al. |
2308.04355v1 |
null |
2023-08-08 |
A Lightweight and Accurate Face Detection Algorithm Based on Retinaface |
Baozhu Liu et.al. |
2308.04340v1 |
null |
2023-08-08 |
Pengembangan Model untuk Mendeteksi Kerusakan pada Terumbu Karang dengan Klasifikasi Citra (Developing a Model to Detect Coral Reef Damage with Image Classification) |
Fadhil Muhammad et.al. |
2308.04337v1 |
null |
2023-08-08 |
Embracing Safe Contacts with Contact-aware Planning and Control |
Zhaoting Li et.al. |
2308.04323v1 |
null |
2023-08-07 |
3D Motion Magnification: Visualizing Subtle Motions with Time Varying Radiance Fields |
Brandon Y. Feng et.al. |
2308.03757v1 |
null |
2023-08-07 |
What about translation? New coding system for content analysis on the perception of literary translation around the political transformation in 1989 in Hungary as a classification problem on an unbalanced dataset |
Dalma Galambos et.al. |
2308.03742v1 |
null |
2023-08-07 |
Efficient Temporal Sentence Grounding in Videos with Multi-Teacher Knowledge Distillation |
Renjie Liang et.al. |
2308.03725v1 |
null |
2023-08-07 |
Automated Real Time Delineation of Supraclavicular Brachial Plexus in Neck Ultrasonography Videos: A Deep Learning Approach |
Abhay Tyagi et.al. |
2308.03717v1 |
null |
2023-08-08 |
Communication-Efficient Framework for Distributed Image Semantic Wireless Transmission |
Bingyan Xie et.al. |
2308.03713v2 |
null |
2023-08-07 |
Scaling may be all you need for achieving human-level object recognition capacity with human-like visual experience |
A. Emin Orhan et.al. |
2308.03712v1 |
link |
2023-08-07 |
Video-based Person Re-identification with Long Short-Term Representation Learning |
Xuehu Liu et.al. |
2308.03703v1 |
null |
2023-08-08 |
Screen-based 3D Subjective Experiment Software |
Songlin Fan et.al. |
2308.03698v2 |
null |
2023-08-07 |
Learning Concise and Descriptive Attributes for Visual Recognition |
An Yan et.al. |
2308.03685v1 |
null |
2023-08-07 |
Detecting Spells in Fantasy Literature with a Transformer Based Artificial Intelligence |
Marcel Moravek et.al. |
2308.03660v1 |
null |
2023-08-04 |
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP |
Qihang Yu et.al. |
2308.02487v1 |
link |
2023-08-04 |
BlindSage: Label Inference Attacks against Node-level Vertical Federated Graph Neural Networks |
Marco Arazzi et.al. |
2308.02465v1 |
null |
2023-08-04 |
Nonprehensile Planar Manipulation through Reinforcement Learning with Multimodal Categorical Exploration |
Juan Del Aguila Ferrandis et.al. |
2308.02459v1 |
null |
2023-08-04 |
Getting the Ball Rolling: Learning a Dexterous Policy for a Biomimetic Tendon-Driven Hand with Rolling Contact Joints |
Yasunori Toshimitsu et.al. |
2308.02453v1 |
null |
2023-08-04 |
Adaptive Preferential Attached kNN Graph With Distribution-Awareness |
Shaojie Min et.al. |
2308.02442v1 |
link |
2023-08-04 |
Scaling Survival Analysis in Healthcare with Federated Survival Forests: A Comparative Study on Heart Failure and Breast Cancer Genomics |
Alberto Archetti et.al. |
2308.02382v1 |
null |
2023-08-04 |
Brain MRI Segmentation using Template-Based Training and Visual Perception Augmentation |
Fang-Cheng Yeh et.al. |
2308.02363v1 |
null |
2023-08-04 |
T-UNet: Triplet UNet for Change Detection in High-Resolution Remote Sensing Images |
Huan Zhong et.al. |
2308.02356v1 |
link |
2023-08-04 |
Adapting to Change: Robust Counterfactual Explanations in Dynamic Data Landscapes |
Bardh Prenkaj et.al. |
2308.02353v1 |
link |
2023-08-04 |
Generative Image Priors for MRI Reconstruction Trained from Magnitude-Only Images |
Guanxiong Luo et.al. |
2308.02340v1 |
null |
2023-08-03 |
FROD: Robust Object Detection for Free |
Muhammad et.al. |
2308.01888v1 |
null |
2023-08-03 |
Similar image retrieval using Autoencoder. I. Automatic morphology classification of galaxies |
Eunsuk Seo et.al. |
2308.01871v1 |
null |
2023-08-03 |
Tag Prediction of Competitive Programming Problems using Deep Learning Techniques |
Taha Lokat et.al. |
2308.01863v1 |
null |
2023-08-03 |
URET: Universal Robustness Evaluation Toolkit (for Evasion) |
Kevin Eykholt et.al. |
2308.01840v1 |
link |
2023-08-03 |
Distribution-Free Inference for the Regression Function of Binary Classification |
Ambrus Tamás et.al. |
2308.01835v1 |
null |
2023-08-03 |
Deep Neural Networks Fused with Textures for Image Classification |
Asish Bera et.al. |
2308.01813v1 |
null |
2023-08-03 |
Deep Learning-based Prediction of Stress and Strain Maps in Arterial Walls for Improved Cardiovascular Risk Assessment |
Yasin Shokrollahi et.al. |
2308.01771v1 |
null |
2023-08-03 |
Focus on Content not Noise: Improving Image Generation for Nuclei Segmentation by Suppressing Steganography in CycleGAN |
Jonas Utz et.al. |
2308.01769v1 |
null |
2023-08-03 |
A Novel Tensor Decomposition of arbitrary order based on Block Convolution with Reflective Boundary Conditions for Multi-Dimensional Data Analysis |
Mahdi Molavi et.al. |
2308.01768v1 |
null |
2023-08-03 |
NuInsSeg: A Fully Annotated Dataset for Nuclei Instance Segmentation in H&E-Stained Histological Images |
Amirreza Mahbod et.al. |
2308.01760v1 |
link |
2023-08-02 |
ELIXR: Towards a general purpose X-ray artificial intelligence system through alignment of large language models and radiology vision encoders |
Shawn Xu et.al. |
2308.01317v1 |
null |
2023-08-02 |
More Context, Less Distraction: Visual Classification by Inferring and Conditioning on Contextual Attributes |
Bang An et.al. |
2308.01313v1 |
link |
2023-08-02 |
Revisiting DETR Pre-training for Object Detection |
Yan Ma et.al. |
2308.01300v1 |
null |
2023-08-02 |
A Probabilistic Approach to Self-Supervised Learning using Cyclical Stochastic Gradient MCMC |
Masoumeh Javanbakhat et.al. |
2308.01271v1 |
null |
2023-08-02 |
Incorporating Season and Solar Specificity into Renderings made by a NeRF Architecture using Satellite Images |
Michael Gableman et.al. |
2308.01262v1 |
link |
2023-08-02 |
Quantum Imprint of the Anharmonic Oscillator |
Prisco Lo Chiatto et.al. |
2308.01244v1 |
null |
2023-08-03 |
CMUNeXt: An Efficient Medical Image Segmentation Network based on Large Kernel and Skip Fusion |
Fenghe Tang et.al. |
2308.01239v2 |
link |
2023-08-02 |
LSF-IDM: Lightweight Deep Learning Models for Automotive Intrusion Detection Model Based on Semantic Fusion |
Pengzhou Cheng et.al. |
2308.01237v1 |
null |
2023-08-02 |
JADES. The diverse population of infant Black Holes at 4<z<11: merging, tiny, poor, but mighty |
Roberto Maiolino et.al. |
2308.01230v1 |
null |
2023-08-02 |
TeachCLIP: Multi-Grained Teaching for Efficient Text-to-Video Retrieval |
Kaibin Tian et.al. |
2308.01217v1 |
null |
2023-08-01 |
Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models |
Cheng-Yu Hsieh et.al. |
2308.00675v1 |
null |
2023-08-01 |
Human-M3: A Multi-view Multi-modal Dataset for 3D Human Pose Estimation in Outdoor Scenes |
Bohao Fan et.al. |
2308.00628v1 |
link |
2023-08-01 |
NeRT: Implicit Neural Representations for General Unsupervised Turbulence Mitigation |
Weiyun Jiang et.al. |
2308.00622v1 |
null |
2023-08-01 |
Beyond One-Hot-Encoding: Injecting Semantics to Drive Image Classifiers |
Alan Perotti et.al. |
2308.00607v1 |
link |
2023-08-01 |
Relation-Aware Distribution Representation Network for Person Clustering with Multiple Modalities |
Kaijian Liu et.al. |
2308.00588v1 |
null |
2023-08-01 |
Gradient Scaling on Deep Spiking Neural Networks with Spike-Dependent Local Information |
Seongsik Park et.al. |
2308.00558v1 |
null |
2023-08-01 |
SF-IDS: An Imbalanced Semi-Supervised Learning Framework for Fine-grained Intrusion Detection |
Xinran Zheng et.al. |
2308.00542v1 |
null |
2023-08-01 |
Compressed Private Aggregation for Scalable and Robust Federated Learning over Massive Networks |
Natalie Lang et.al. |
2308.00540v1 |
link |
2023-08-01 |
Predicting Early Dropouts of an Active and Healthy Ageing App |
Vasileios Perifanis et.al. |
2308.00539v1 |
null |
2023-08-01 |
PressureTransferNet: Human Attribute Guided Dynamic Ground Pressure Profile Transfer using 3D simulated Pressure Maps |
Lala Shakti Swarup Ray et.al. |
2308.00538v1 |
null |
2023-07-31 |
A Quantized Interband Topological Index in Two-Dimensional Systems |
Tharindu Fernando et.al. |
2307.16893v1 |
null |
2023-07-31 |
Foundational Models for Fault Diagnosis of Electrical Motors |
Sriram Anbalagan et.al. |
2307.16891v1 |
null |
2023-07-31 |
Discovering Adaptable Symbolic Algorithms from Scratch |
Stephen Kelly et.al. |
2307.16890v1 |
null |
2023-07-31 |
Universal Adversarial Defense in Remote Sensing Based on Pre-trained Denoising Diffusion Models |
Weikang Yu et.al. |
2307.16865v1 |
null |
2023-07-31 |
Nonlinearity-induced topological phase transition characterized by the nonlinear Chern number |
Kazuki Sone et.al. |
2307.16827v1 |
null |
2023-07-31 |
Defense of Adversarial Ranking Attack in Text Retrieval: Benchmark and Baseline via Detection |
Xuanang Chen et.al. |
2307.16816v1 |
null |
2023-07-31 |
Capturing Co-existing Distortions in User-Generated Content for No-reference Video Quality Assessment |
Kun Yuan et.al. |
2307.16813v1 |
null |
2023-07-31 |
DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures |
Hannah Rose Kirk et.al. |
2307.16811v1 |
null |
2023-07-31 |
DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation |
Yue Zhang et.al. |
2307.16803v1 |
null |
2023-07-31 |
Classification with Deep Neural Networks and Logistic Loss |
Zihan Zhang et.al. |
2307.16792v1 |
null |
2023-07-28 |
Quantum-noise-limited optical neural networks operating at a few quanta per activation |
Shi-Yuan Ma et.al. |
2307.15712v1 |
null |
2023-07-31 |
MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking |
Ruopeng Gao et.al. |
2307.15700v2 |
null |
2023-07-28 |
PatchMixer: Rethinking network design to boost generalization for 3D point cloud understanding |
Davide Boscaini et.al. |
2307.15692v1 |
null |
2023-07-28 |
ODTlearn: A Package for Learning Optimal Decision Trees for Prediction and Prescription |
Patrick Vossler et.al. |
2307.15691v1 |
link |
2023-07-28 |
Dynamic Analysis and an Eigen Initializer for Recurrent Neural Networks |
Ran Dou et.al. |
2307.15679v1 |
null |
2023-07-28 |
Bayesian Time-Series Classifier for Decoding Simple Visual Stimuli from Intracranial Neural Activity |
Navid Ziaei et.al. |
2307.15672v1 |
null |
2023-07-28 |
Classifying core collapse supernova remnants by their morphology as shaped by the last exploding jets |
Noam Soker et.al. |
2307.15666v1 |
null |
2023-07-28 |
Multi-layer Aggregation as a key to feature-based OOD detection |
Benjamin Lambert et.al. |
2307.15647v1 |
null |
2023-07-28 |
Scale-aware Test-time Click Adaptation for Pulmonary Nodule and Mass Segmentation |
Zhihao Li et.al. |
2307.15645v1 |
link |
2023-07-28 |
TriadNet: Sampling-free predictive intervals for lesional volume in 3D brain MR images |
Benjamin Lambert et.al. |
2307.15638v1 |
null |
2023-07-27 |
PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking |
Yang Zheng et.al. |
2307.15055v1 |
null |
2023-07-27 |
A Transformer-based Approach for Arabic Offline Handwritten Text Recognition |
Saleh Momeni et.al. |
2307.15045v1 |
null |
2023-07-27 |
Drive Asymmetry, Convergence and the Origin of Turbulence in ICF Implosions |
Vincent A. Thomas et.al. |
2307.15028v1 |
null |
2023-07-27 |
Self-Supervised Graph Transformer for Deepfake Detection |
Aminollah Khormali et.al. |
2307.15019v1 |
null |
2023-07-27 |
The last patch for classifying shuffle groups |
Junyang Zhang et.al. |
2307.15012v1 |
null |
2023-07-27 |
Gzip versus bag-of-words for text classification with KNN |
Juri Opitz et.al. |
2307.15002v1 |
null |
2023-07-27 |
Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs |
Or Sharir et.al. |
2307.14988v1 |
null |
2023-07-27 |
Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models |
Ziyi Wang et.al. |
2307.14971v1 |
link |
2023-07-27 |
Federated Model Aggregation via Self-Supervised Priors for Highly Imbalanced Medical Image Classification |
Marawan Elbatel et.al. |
2307.14959v1 |
link |
2023-07-27 |
Multi-Source Domain Adaptation through Dataset Dictionary Learning in Wasserstein Space |
Eduardo Fernandes Montesuma et.al. |
2307.14953v1 |
null |
2023-07-26 |
MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation |
Rajeev Yasarla et.al. |
2307.14336v1 |
null |
2023-07-26 |
Event-based Vision for Early Prediction of Manipulation Actions |
Daniel Deniz et.al. |
2307.14332v1 |
null |
2023-07-26 |
Waypoint-Based Imitation Learning for Robotic Manipulation |
Lucy Xiaoyang Shi et.al. |
2307.14326v1 |
null |
2023-07-26 |
Unraveling the Complexity of Splitting Sequential Data: Tackling Challenges in Video and Time Series Analysis |
Diego Botache et.al. |
2307.14294v1 |
null |
2023-07-26 |
G2L: Semantically Aligned and Uniform Video Grounding via Geodesic and Game Theory |
Hongxiang Li et.al. |
2307.14277v1 |
null |
2023-07-26 |
Deepfake Image Generation for Improved Brain Tumor Segmentation |
Roa'a Al-Emaryeen et.al. |
2307.14273v1 |
null |
2023-07-26 |
Sim-to-Real Model-Based and Model-Free Deep Reinforcement Learning for Tactile Pushing |
Max Yang et.al. |
2307.14272v1 |
null |
2023-07-26 |
Artifact Restoration in Histology Images with Diffusion Probabilistic Models |
Zhenqi He et.al. |
2307.14262v1 |
link |
2023-07-26 |
Defending Adversarial Patches via Joint Region Localizing and Inpainting |
Junwen Chen et.al. |
2307.14242v1 |
null |
2023-07-26 |
DisguisOR: Holistic Face Anonymization for the Operating Room |
Lennart Bastian et.al. |
2307.14241v1 |
link |
2023-07-25 |
RED CoMETS: An ensemble classifier for symbolically represented multivariate time series |
Luca A. Bennett et.al. |
2307.13679v1 |
link |
2023-07-25 |
QuickQual: Lightweight, convenient retinal image quality scoring with off-the-shelf pretrained models |
Justin Engelmann et.al. |
2307.13646v1 |
link |
2023-07-25 |
Manifestly Covariant Worldline Actions from Coadjoint Orbits. Part I: Generalities and Vectorial Descriptions |
Thomas Basile et.al. |
2307.13644v1 |
null |
2023-07-25 |
Optical Flow boosts Unsupervised Localization and Segmentation |
Xinyu Zhang et.al. |
2307.13640v1 |
link |
2023-07-25 |
Insights into Cognitive Engagement: Comparing the Effectiveness of Game-Based and Video-Based Learning |
Shayla Sharmin et.al. |
2307.13637v1 |
null |
2023-07-25 |
Contributions to the Improvement of Question Answering Systems in the Biomedical Domain |
Mourad Sarrouti et.al. |
2307.13631v1 |
null |
2023-07-25 |
Chandra X-ray Observatory Observations of 13 Fermi LAT Sources |
Blagoy Rangelov et.al. |
2307.13594v1 |
null |
2023-07-25 |
Reinterpreting survival analysis in the universal approximator age |
Sören Dittmer et.al. |
2307.13579v1 |
link |
2023-07-25 |
PT$\mathrm{L}^{p}$: Partial Transport $\mathrm{L}^{p}$ Distances |
Xinran Liu et.al. |
2307.13571v1 |
null |
2023-07-25 |
Group Activity Recognition in Computer Vision: A Comprehensive Review, Challenges, and Future Perspectives |
Chuanchuan Wang et.al. |
2307.13541v1 |
null |
2023-07-24 |
Leveraging Label Variation in Large Language Models for Zero-Shot Text Classification |
Flor Miriam Plaza-del-Arco et.al. |
2307.12973v1 |
null |
2023-07-24 |
A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning |
Benjamin Eysenbach et.al. |
2307.12968v1 |
link |
2023-07-24 |
Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment |
Sarah Ibrahimi et.al. |
2307.12964v1 |
null |
2023-07-24 |
Rule By Example: Harnessing Logical Rules for Explainable Hate Speech Detection |
Christopher Clarke et.al. |
2307.12935v1 |
link |
2023-07-25 |
Towards a Visual-Language Foundation Model for Computational Pathology |
Ming Y. Lu et.al. |
2307.12914v2 |
null |
2023-07-24 |
Dyn-E: Local Appearance Editing of Dynamic Neural Radiance Fields |
Shangzhan Zhang et.al. |
2307.12909v1 |
null |
2023-07-24 |
Conditional Residual Coding: A Remedy for Bottleneck Problems in Conditional Inter Frame Coding |
Fabian Brand et.al. |
2307.12864v1 |
null |
2023-07-24 |
Multiscale Video Pretraining for Long-Term Activity Forecasting |
Reuben Tan et.al. |
2307.12854v1 |
null |
2023-07-25 |
Spatiotemporal Modeling Encounters 3D Medical Image Analysis: Slice-Shift UNet with Multi-View Fusion |
C. I. Ugwu et.al. |
2307.12853v2 |
null |
2023-07-24 |
Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization |
Hancheng Min et.al. |
2307.12851v1 |
null |
2023-07-21 |
Advanced Monte Carlo simulation techniques to study polymers under equilibrium conditions |
Monika Angwani et.al. |
2307.11722v1 |
null |
2023-07-21 |
Deep Learning Hyperspectral Pansharpening on large scale PRISMA dataset |
Simone Zini et.al. |
2307.11666v1 |
null |
2023-07-21 |
FEDD -- Fair, Efficient, and Diverse Diffusion-based Lesion Segmentation and Malignancy Classification |
Héctor Carrión et.al. |
2307.11654v1 |
null |
2023-07-21 |
Sparse Cholesky factorization by greedy conditional selection |
Stephen Huan et.al. |
2307.11648v1 |
link |
2023-07-24 |
Morphological Image Analysis and Feature Extraction for Reasoning with AI-based Defect Detection and Classification Models |
Jiajun Zhang et.al. |
2307.11643v2 |
null |
2023-07-21 |
Deep Reinforcement Learning Based System for Intraoperative Hyperspectral Video Autofocusing |
Charlie Budd et.al. |
2307.11638v1 |
null |
2023-07-21 |
Computational Image Formation |
Stanley H. Chan et.al. |
2307.11635v1 |
null |
2023-07-21 |
Finding Optimal Diverse Feature Sets with Alternative Feature Selection |
Jakob Bach et.al. |
2307.11607v1 |
null |
2023-07-21 |
Cascaded multitask U-Net using topological loss for vessel segmentation and centerline extraction |
Pierre Rougé et.al. |
2307.11603v1 |
null |
2023-07-21 |
Mixbiotic society measures: Assessment of community well-going as living system |
Takeshi Kato et.al. |
2307.11594v1 |
null |
2023-07-20 |
GLSFormer: Gated - Long, Short Sequence Transformer for Step Recognition in Surgical Videos |
Nisarg A. Shah et.al. |
2307.11081v1 |
link |
2023-07-20 |
Driving Policy Prediction based on Deep Learning Models |
Fuxiao Liu et.al. |
2307.11058v1 |
null |
2023-07-20 |
Cascade-DETR: Delving into High-Quality Universal Object Detection |
Mingqiao Ye et.al. |
2307.11035v1 |
link |
2023-07-20 |
Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification |
Neel Guha et.al. |
2307.11031v1 |
null |
2023-07-20 |
Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering |
Yijun Dong et.al. |
2307.11030v1 |
null |
2023-07-20 |
Multi-objective point cloud autoencoders for explainable myocardial infarction prediction |
Marcel Beetz et.al. |
2307.11017v1 |
null |
2023-07-20 |
Treatment And Follow-Up Guidelines For Multiple Brain Metastases: A Systematic Review |
Ana Sofia Santos et.al. |
2307.11016v1 |
null |
2023-07-21 |
Dense Sample Deep Learning |
Stephen Josè Hanson et.al. |
2307.10991v2 |
null |
2023-07-20 |
Deep Spiking-UNet for Image Processing |
Hebei Li et.al. |
2307.10974v1 |
link |
2023-07-20 |
Spinal nerve segmentation method and dataset construction in endoscopic surgical scenarios |
Shaowu Peng et.al. |
2307.10955v1 |
link |
2023-07-19 |
DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering |
Wei Cheng et.al. |
2307.10173v1 |
link |
2023-07-19 |
Adversarial Latent Autoencoder with Self-Attention for Structural Image Synthesis |
Jiajie Fan et.al. |
2307.10166v1 |
null |
2023-07-19 |
Leveraging Visemes for Better Visual Speech Representation and Lip Reading |
Javad Peymanfard et.al. |
2307.10157v1 |
null |
2023-07-19 |
Remarks on a theorem of Pink in presence of bad reduction |
Wojciech Gajda et.al. |
2307.10140v1 |
null |
2023-07-19 |
Gradient Sparsification For Masked Fine-Tuning of Transformers |
James O'Neill et.al. |
2307.10098v1 |
null |
2023-07-19 |
Boundary-Refined Prototype Generation: A General End-to-End Paradigm for Semi-Supervised Semantic Segmentation |
Junhao Dong et.al. |
2307.10097v1 |
null |
2023-07-19 |
Make-A-Volume: Leveraging Latent Diffusion Models for Cross-Modality 3D Brain MRI Synthesis |
Lingting Zhu et.al. |
2307.10094v1 |
null |
2023-07-19 |
Divert More Attention to Vision-Language Object Tracking |
Mingzhe Guo et.al. |
2307.10046v1 |
link |
2023-07-19 |
A non-monotone extra-gradient trust-region method with noisy oracles |
Natasa Krejic et.al. |
2307.10038v1 |
null |
2023-07-20 |
Class Attention to Regions of Lesion for Imbalanced Medical Image Recognition |
Jia-Xin Zhuang et.al. |
2307.10036v2 |
null |
2023-07-18 |
AnyDoor: Zero-shot Object-level Image Customization |
Xi Chen et.al. |
2307.09481v1 |
null |
2023-07-18 |
FACTS: Facial Animation Creation using the Transfer of Styles |
Jack Saunders et.al. |
2307.09480v1 |
null |
2023-07-18 |
GroupLane: End-to-End 3D Lane Detection with Channel-wise Grouping |
Zhuoling Li et.al. |
2307.09472v1 |
null |
2023-07-18 |
Smooth Attention for Deep Multiple Instance Learning: Application to CT Intracranial Hemorrhage Detection |
Yunan Wu et.al. |
2307.09457v1 |
link |
2023-07-19 |
A comparative analysis of SRGAN models |
Fatemeh Rezapoor Nikroo et.al. |
2307.09456v2 |
null |
2023-07-19 |
Pseudo Outlier Exposure for Out-of-Distribution Detection using Pretrained Transformers |
Jaeyoung Kim et.al. |
2307.09455v2 |
null |
2023-07-18 |
Measuring Student Behavioral Engagement using Histogram of Actions |
Ahmed Abdelkawy et.al. |
2307.09420v1 |
null |
2023-07-18 |
Is this Snippet Written by ChatGPT? An Empirical Study with a CodeBERT-Based Classifier |
Phuong T. Nguyen et.al. |
2307.09381v1 |
null |
2023-07-18 |
CertPri: Certifiable Prioritization for Deep Neural Networks via Movement Cost in Feature Space |
Haibin Zheng et.al. |
2307.09375v1 |
null |
2023-07-18 |
Enhancing Pattern Classification in Support Vector Machines through Matrix Formulation |
Sambhav Jain Reshma Rastogi et.al. |
2307.09372v1 |
null |
2023-07-17 |
Diffusion Models Beat GANs on Image Classification |
Soumik Mukhopadhyay et.al. |
2307.08702v1 |
null |
2023-07-17 |
Neural Video Depth Stabilizer |
Yiran Wang et.al. |
2307.08695v1 |
link |
2023-07-17 |
SEMI-DiffusionInst: A Diffusion Model Based Approach for Semiconductor Defect Classification and Segmentation |
Vic De Ridder et.al. |
2307.08693v1 |
null |
2023-07-17 |
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning |
Tri Dao et.al. |
2307.08691v1 |
link |
2023-07-17 |
Implementation of a perception system for autonomous vehicles using a detection-segmentation network in SoC FPGA |
Maciej Baczmanski et.al. |
2307.08682v1 |
null |
2023-07-17 |
Neural Image Compression: Generalization, Robustness, and Spectral Biases |
Kelsey Lieberman et.al. |
2307.08657v1 |
null |
2023-07-17 |
PolyGNN: Polyhedron-based Graph Neural Network for 3D Building Reconstruction from Point Clouds |
Zhaiyu Chen et.al. |
2307.08636v1 |
null |
2023-07-17 |
Deficiency-Aware Masked Transformer for Video Inpainting |
Yongsheng Yu et.al. |
2307.08629v1 |
link |
2023-07-17 |
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs |
Yang Zhao et.al. |
2307.08581v1 |
null |
2023-07-18 |
Deep Learning with Passive Optical Nonlinear Mapping |
Fei Xia et.al. |
2307.08558v2 |
null |
2023-07-14 |
Expressive Monotonic Neural Networks |
Ouail Kitouni et.al. |
2307.07512v1 |
link |
2023-07-14 |
Streaming CTR Prediction: Rethinking Recommendation Task for Real-World Streaming Data |
Qi-Wei Wang et.al. |
2307.07509v1 |
null |
2023-07-14 |
Brain Tumor Detection using Convolutional Neural Networks with Skip Connections |
Aupam Hamran et.al. |
2307.07503v1 |
null |
2023-07-14 |
TALL: Thumbnail Layout for Deepfake Video Detection |
Yuting Xu et.al. |
2307.07494v1 |
null |
2023-07-14 |
DreamTeacher: Pretraining Image Backbones with Deep Generative Models |
Daiqing Li et.al. |
2307.07487v1 |
null |
2023-07-14 |
Multimodal Distillation for Egocentric Action Recognition |
Gorjan Radevski et.al. |
2307.07483v1 |
null |
2023-07-14 |
Dual-Query Multiple Instance Learning for Dynamic Meta-Embedding based Tumor Classification |
Simon Holdenried-Krafft et.al. |
2307.07482v1 |
null |
2023-07-14 |
Passage-times for partially-homogeneous reflected random walks on the quadrant |
Conrado da Costa et.al. |
2307.07458v1 |
null |
2023-07-14 |
An equivariant surgery classification of $C_p$-surfaces |
Kelly Pohland et.al. |
2307.07446v1 |
null |
2023-07-14 |
Can Large Language Models Empower Molecular Property Prediction? |
Chen Qian et.al. |
2307.07443v1 |
link |
2023-07-13 |
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition |
Syed Talal Wasim et.al. |
2307.06947v1 |
link |
2023-07-13 |
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation |
Yi Wang et.al. |
2307.06942v1 |
link |
2023-07-13 |
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation |
Yingqing He et.al. |
2307.06940v1 |
link |
2023-07-13 |
DRAGON: A Dialogue-Based Robot for Assistive Navigation with Visual Language Grounding |
Shuijing Liu et.al. |
2307.06924v1 |
null |
2023-07-13 |
Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks |
Liam Collins et.al. |
2307.06887v1 |
null |
2023-07-13 |
LVLane: Deep Learning for Lane Detection and Classification in Challenging Conditions |
Zillur Rahman et.al. |
2307.06853v1 |
link |
2023-07-13 |
Leveraging Vision-Language Foundation Models for Fine-Grained Downstream Tasks |
Denis Coquenet et.al. |
2307.06795v1 |
link |
2023-07-13 |
Robotic surface exploration with vision and tactile sensing for cracks detection and characterisation |
Francesca Palermo et.al. |
2307.06784v1 |
null |
2023-07-13 |
Generalizing Supervised Deep Learning MRI Reconstruction to Multiple and Unseen Contrasts using Meta-Learning Hypernetworks |
Sriprabha Ramanarayanan et.al. |
2307.06771v1 |
link |
2023-07-13 |
Pairs of inner projections and two applications |
Ramlal Debnath et.al. |
2307.06744v1 |
null |
2023-07-12 |
Deep Learning of Crystalline Defects from TEM images: A Solution for the Problem of "Never Enough Training Data" |
Kishan Govind et.al. |
2307.06322v1 |
null |
2023-07-12 |
A geometric classification of rod complements in the 3-torus |
Connie On Yu Hui et.al. |
2307.06317v1 |
null |
2023-07-12 |
Facial Reenactment Through a Personalized Generator |
Ariel Elazary et.al. |
2307.06307v1 |
null |
2023-07-12 |
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution |
Mostafa Dehghani et.al. |
2307.06304v1 |
null |
2023-07-12 |
Feature Embeddings from Large-Scale Acoustic Bird Classifiers Enable Few-Shot Transfer Learning |
Burooj Ghani et.al. |
2307.06292v1 |
null |
2023-07-12 |
Stochastic Light Field Holography |
Florian Schiffers et.al. |
2307.06277v1 |
null |
2023-07-12 |
Machine learning and Topological data analysis identify unique features of human papillae in 3D scans |
Rayna Andreeva et.al. |
2307.06255v1 |
null |
2023-07-12 |
On the Importance of Denoising when Learning to Compress Images |
Benoit Brummer et.al. |
2307.06233v1 |
link |
2023-07-12 |
Ashaar: Automatic Analysis and Generation of Arabic Poetry Using Deep Learning Approaches |
Zaid Alyafeai et.al. |
2307.06218v1 |
link |
2023-07-12 |
Local Conditional Neural Fields for Versatile and Generalizable Large-Scale Reconstructions in Computational Imaging |
Hao Wang et.al. |
2307.06207v1 |
null |
2023-07-11 |
Fractonic Higher-Order Topological Phases in Open Quantum Systems |
Jian-Hao Zhang et.al. |
2307.05474v1 |
null |
2023-07-11 |
Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives |
Tom Monnier et.al. |
2307.05473v1 |
null |
2023-07-11 |
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone |
Shraman Pramanick et.al. |
2307.05463v1 |
null |
2023-07-11 |
Improving the Security of Smartwatch Payment with Deep Learning |
George Webber et.al. |
2307.05437v1 |
null |
2023-07-11 |
One-Versus-Others Attention: Scalable Multimodal Integration |
Michal Golovanevsky et.al. |
2307.05435v1 |
link |
2023-07-11 |
Identifying Acoustic Wave Sources on the Sun. II. Improved Filter Techniques for Source Wavefield Seismology |
Shah Mohammad Bahauddin et.al. |
2307.05433v1 |
null |
2023-07-11 |
Effective Whitney Stratification of Real Algebraic Varieties |
Martin Helmer et.al. |
2307.05427v1 |
null |
2023-07-11 |
Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform |
Mateusz Wójcik et.al. |
2307.05399v1 |
link |
2023-07-11 |
ShredGP: Guitarist Style-Conditioned Tablature Generation |
Pedro Sarmento et.al. |
2307.05324v1 |
null |
2023-07-11 |
Class Instance Balanced Learning for Long-Tailed Classification |
Marc-Antoine Lavoie et.al. |
2307.05322v1 |
null |
2023-07-10 |
Semantic-SAM: Segment and Recognize Anything at Any Granularity |
Feng Li et.al. |
2307.04767v1 |
link |
2023-07-10 |
Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos |
Sagnik Majumder et.al. |
2307.04760v1 |
null |
2023-07-10 |
Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement |
Anthony Simeonov et.al. |
2307.04751v1 |
null |
2023-07-10 |
RoCo: Dialectic Multi-Robot Collaboration with Large Language Models |
Zhao Mandi et.al. |
2307.04738v1 |
link |
2023-07-10 |
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning |
Yuwei Guo et.al. |
2307.04725v1 |
null |
2023-07-10 |
Quark/Gluon Discrimination and Top Tagging with Dual Attention Transformer |
Minxuan He et.al. |
2307.04723v1 |
null |
2023-07-10 |
CVPR MultiEarth 2023 Deforestation Estimation Challenge: SpaceVision4Amazon |
Sunita Arya et.al. |
2307.04715v1 |
null |
2023-07-10 |
Multimodal brain age estimation using interpretable adaptive population-graph learning |
Kyriaki-Margarita Bintsi et.al. |
2307.04639v1 |
null |
2023-07-10 |
Learning Fine Pinch-Grasp Skills using Tactile Sensing from Real Demonstration Data |
Xiaofeng Mao et.al. |
2307.04619v1 |
null |
2023-07-10 |
Weakly-supervised positional contrastive learning: application to cirrhosis classification |
Emma Sarfati et.al. |
2307.04617v1 |
null |
2023-07-07 |
On the representation theory of cyclic and dihedral quandles |
Mohamed Elhamdadi et.al. |
2307.03728v1 |
null |
2023-07-07 |
Polybot: Training One Policy Across Robots While Embracing Variability |
Jonathan Yang et.al. |
2307.03719v1 |
null |
2023-07-07 |
Motion Magnification in Robotic Sonography: Enabling Pulsation-Aware Artery Segmentation |
Dianye Huang et.al. |
2307.03698v1 |
null |
2023-07-07 |
Detecting the Sensing Area of A Laparoscopic Probe in Minimally Invasive Cancer Surgery |
Baoru Huang et.al. |
2307.03662v1 |
null |
2023-07-07 |
Physical-aware Cross-modal Adversarial Network for Wearable Sensor-based Human Action Recognition |
Jianyuan Ni et.al. |
2307.03638v1 |
null |
2023-07-07 |
VesselVAE: Recursive Variational Autoencoders for 3D Blood Vessel Synthesis |
Paula Feldman et.al. |
2307.03592v1 |
null |
2023-07-07 |
SpawnNet: Learning Generalizable Visuomotor Skills from Pre-trained Networks |
Xingyu Lin et.al. |
2307.03567v1 |
null |
2023-07-07 |
VariGrad: A Novel Feature Vector Architecture for Geometric Deep Learning on Unregistered Data |
Emmanuel Hartman et.al. |
2307.03553v1 |
null |
2023-07-07 |
TBGC: Task-level Backbone-Oriented Gradient Clip for Multi-Task Foundation Model Learning |
Zelun Zhang et.al. |
2307.03465v1 |
null |
2023-07-07 |
A Deep Active Contour Model for Delineating Glacier Calving Fronts |
Konrad Heidler et.al. |
2307.03461v1 |
null |
2023-07-06 |
Synthesizing Artistic Cinemagraphs from Text |
Aniruddha Mahapatra et.al. |
2307.03190v1 |
null |
2023-07-06 |
Long-term follow-up observations of extreme coronal line emitting galaxies |
Peter Clark et.al. |
2307.03182v1 |
null |
2023-07-06 |
Push Past Green: Learning to Look Behind Plant Foliage by Moving It |
Xiaoyu Zhang et.al. |
2307.03175v1 |
null |
2023-07-06 |
VideoGLUE: Video General Understanding Evaluation of Foundation Models |
Liangzhe Yuan et.al. |
2307.03166v1 |
null |
2023-07-06 |
Can Domain Adaptation Improve Accuracy and Fairness of Skin Lesion Classification? |
Janet Wang et.al. |
2307.03157v1 |
null |
2023-07-06 |
MultiVENT: Multilingual Videos of Events with Aligned Natural Text |
Kate Sanders et.al. |
2307.03153v1 |
null |
2023-07-06 |
Topology-Aware Loss for Aorta and Great Vessel Segmentation in Computed Tomography Images |
Seher Ozcelik et.al. |
2307.03137v1 |
null |
2023-07-06 |
Distilling Large Vision-Language Model with Out-of-Distribution Generalizability |
Xuanlin Li et.al. |
2307.03135v1 |
link |
2023-07-06 |
Benchmarking Test-Time Adaptation against Distribution Shifts in Image Classification |
Yongcan Yu et.al. |
2307.03133v1 |
link |
2023-07-06 |
VisKoP: Visual Knowledge oriented Programming for Interactive Knowledge Base Question Answering |
Zijun Yao et.al. |
2307.03130v1 |
null |
2023-07-05 |
Building Cooperative Embodied Agents Modularly with Large Language Models |
Hongxin Zhang et.al. |
2307.02485v1 |
null |
2023-07-05 |
Elastic Decision Transformer |
Yueh-Hua Wu et.al. |
2307.02484v1 |
null |
2023-07-05 |
What Matters in Training a GPT4-Style Language Model with Multimodal Inputs? |
Yan Zeng et.al. |
2307.02469v1 |
null |
2023-07-05 |
Supersymmetric asymptotically locally AdS$_5$ gravitational solitons |
Turkuler Durgut et.al. |
2307.02466v1 |
null |
2023-07-05 |
AxonCallosumEM Dataset: Axon Semantic Segmentation of Whole Corpus Callosum cross section from EM Images |
Ao Cheng et.al. |
2307.02464v1 |
null |
2023-07-05 |
Expert-Agnostic Ultrasound Image Quality Assessment using Deep Variational Clustering |
Deepak Raina et.al. |
2307.02462v1 |
null |
2023-07-05 |
LLCaps: Learning to Illuminate Low-Light Capsule Endoscopy with Curved Wavelet Attention and Reverse Diffusion |
Long Bai et.al. |
2307.02452v1 |
link |
2023-07-05 |
On Deep Learning Classification of Digitally Modulated Signals Using Raw I/Q Data |
John A. Snoap et.al. |
2307.02450v1 |
null |
2023-07-05 |
Vulnerable Source Code Detection using SonarCloud Code Analysis |
Alifia Puspaningrum et.al. |
2307.02446v1 |
null |
2023-07-05 |
Base Layer Efficiency in Scalable Human-Machine Coding |
Yalda Foroutan et.al. |
2307.02430v1 |
null |
2023-07-03 |
Real-time Monocular Full-body Capture in World Space via Sequential Proxy-to-Motion Learning |
Yuxiang Zhang et.al. |
2307.01200v1 |
null |
2023-07-03 |
Segment Anything Meets Point Tracking |
Frano Rajič et.al. |
2307.01197v1 |
link |
2023-07-03 |
Online nearest neighbor classification |
Sanjoy Dasgupta et.al. |
2307.01170v1 |
null |
2023-07-03 |
Don't freeze: Finetune encoders for better Self-Supervised HAR |
Vitor Fortes Rey et.al. |
2307.01168v1 |
null |
2023-07-03 |
Characteristic signatures of accreting binary black holes produced by eccentric minidisks |
John Ryan Westernacher-Schneider et.al. |
2307.01154v1 |
null |
2023-07-03 |
Integral cohomology rings of weighted Grassmann orbifolds and Rigidity properties |
Koushik Brahma et.al. |
2307.01153v1 |
null |
2023-07-03 |
Investigating Data Memorization in 3D Latent Diffusion Models for Medical Image Synthesis |
Salman Ul Hassan Dar et.al. |
2307.01148v1 |
null |
2023-07-05 |
AVSegFormer: Audio-Visual Segmentation with Transformer |
Shengyi Gao et.al. |
2307.01146v2 |
link |
2023-07-03 |
Cross-modality Attention Adapter: A Glioma Segmentation Fine-tuning Method for SAM Using Multimodal Brain MR Images |
Xiaoyu Shi et.al. |
2307.01124v1 |
null |
2023-07-03 |
Supervised Manifold Learning via Random Forest Geometry-Preserving Proximities |
Jake S. Rhodes et.al. |
2307.01077v1 |
null |
2023-07-03 |
SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs |
Lijun Yu et.al. |
2306.17842v2 |
null |
2023-06-30 |
Learning Evacuee Models from Robot-Guided Emergency Evacuation Experiments |
Mollik Nayyar et.al. |
2306.17824v1 |
null |
2023-06-30 |
Act3D: Infinite Resolution Action Detection Transformer for Robotic Manipulation |
Theophile Gervet et.al. |
2306.17817v1 |
null |
2023-06-30 |
Topologically Attributed Graphs for Shape Discrimination |
Justin Curry et.al. |
2306.17805v1 |
null |
2023-06-30 |
Vision Through the Veil: Differential Privacy in Federated Learning for Medical Image Classification |
Kishore Babu Nampalle et.al. |
2306.17794v1 |
null |
2023-06-30 |
Precision Anti-Cancer Drug Selection via Neural Ranking |
Vishal Dey et.al. |
2306.17771v1 |
null |
2023-06-30 |
Improved NL2SQL based on Multi-layer Expert Network |
Chenduo Hao et.al. |
2306.17727v1 |
null |
2023-06-30 |
Content-Preserving Diffusion Model for Unsupervised AS-OCT image Despeckling |
Li Sanqian et.al. |
2306.17717v1 |
null |
2023-06-30 |
Evaluation of the Benefits of Zero Velocity Update in Decentralized EKF-Based Cooperative Localization Algorithms for GNSS-Denied Multi-Robot Systems |
Cagri Kilic et.al. |
2306.17703v1 |
null |
2023-06-30 |
Generalized Time Warping Invariant Dictionary Learning for Time Series Classification and Clustering |
Ruiyu Xu et.al. |
2306.17690v1 |
null |
2023-06-29 |
An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training |
Zitian Chen et.al. |
2306.17165v1 |
null |
2023-06-29 |
Can Machines Garden? Systematically Comparing the AlphaGarden vs. Professional Horticulturalists |
Simeon Adebola et.al. |
2306.17162v1 |
null |
2023-06-29 |
FogROS2-SGC: A ROS2 Cloud Robotics Platform for Secure Global Connectivity |
Kaiyuan Chen et.al. |
2306.17157v1 |
null |
2023-06-29 |
Orbit Classification of asteroids using implementation of radial Basis Function on Support Vector Machines |
Yashvir Tiberwal et.al. |
2306.17138v1 |
null |
2023-06-29 |
On separably integrable symmetric convex bodies |
Vladyslav Yaskin et.al. |
2306.17127v1 |
null |
2023-06-29 |
PVP: Personalized Video Prior for Editable Dynamic Portraits using StyleGAN |
Kai-En Lin et.al. |
2306.17123v1 |
null |
2023-06-29 |
Learning Nuclei Representations with Masked Image Modelling |
Piotr Wójcik et.al. |
2306.17116v1 |
null |
2023-06-29 |
Deep Ensemble for Rotorcraft Attitude Prediction |
Hikmat Khan et.al. |
2306.17104v1 |
null |
2023-06-29 |
Twice Binnable Color Filter Arrays |
Mritunjay Singh et.al. |
2306.17078v1 |
null |
2023-06-29 |
Extremal behavior of reduced type of one dimensional rings |
Sarasij Maitra et.al. |
2306.17069v1 |
null |
2023-06-28 |
Class Numbers, Congruent Numbers and Umbral Moonshine |
Miranda C. N. Cheng et.al. |
2306.16414v1 |
null |
2023-06-28 |
Information-Computation Tradeoffs for Learning Margin Halfspaces with Random Classification Noise |
Ilias Diakonikolas et.al. |
2306.16352v1 |
null |
2023-06-28 |
Accurate, uncertainty-aware classification of molecular chemical motifs from multi-modal X-ray absorption spectroscopy |
Matthew R. Carbone et.al. |
2306.16349v1 |
null |
2023-06-28 |
DoseDiff: Distance-aware Diffusion Model for Dose Prediction in Radiotherapy |
Yiwen Zhang et.al. |
2306.16324v1 |
null |
2023-06-28 |
Universal theory of spin-momentum-orbital-site locking |
Yuntian Liu et.al. |
2306.16312v1 |
null |
2023-06-28 |
Generalizing Surgical Instruments Segmentation to Unseen Domains with One-to-Many Synthesis |
An Wang et.al. |
2306.16285v1 |
link |
2023-06-28 |
Emotion Analysis of Tweets Banning Education in Afghanistan |
Mohammad Ali Hussiny et.al. |
2306.16268v1 |
null |
2023-06-28 |
Reconfigurable Robot Control Using Flexible Coupling Mechanisms |
Sha Yi et.al. |
2306.16265v1 |
null |
2023-06-28 |
Latent SDEs on Homogeneous Spaces |
Sebastian Zeng et.al. |
2306.16248v1 |
null |
2023-06-28 |
Investigating the Uncanny Valley Phenomenon Through the Temporal Dynamics of Neural Responses to Virtual Characters |
Chiara Gorlini et.al. |
2306.16233v1 |
null |
2023-06-27 |
Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties |
Hsiao-Yu Tung et.al. |
2306.15668v1 |
null |
2023-06-27 |
Enhancing Representation Learning on High-Dimensional, Small-Size Tabular Data: A Divide and Conquer Method with Ensembled VAEs |
Navindu Leelarathna et.al. |
2306.15661v1 |
null |
2023-06-27 |
Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos |
Chiori Hori et.al. |
2306.15644v1 |
null |
2023-06-27 |
Biclustering random matrix partitions with an application to classification of forensic body fluids |
Chieh-Hsi Wu et.al. |
2306.15622v1 |
null |
2023-06-27 |
Recurrent Neural Network-coupled SPAD TCSPC System for Real-time Fluorescence Lifetime Imaging |
Yang Lin et.al. |
2306.15599v1 |
null |
2023-06-27 |
Optimizing Credit Limit Adjustments Under Adversarial Goals Using Reinforcement Learning |
Sherly Alfonso-Sánchez et.al. |
2306.15585v1 |
null |
2023-06-27 |
Parity doublet model for baryon octets: diquark classifications and mass hierarchy based on the quark-line diagram |
Takuya Minamikawa et.al. |
2306.15564v1 |
null |
2023-06-27 |
You Can Mask More For Extremely Low-Bitrate Image Compression |
Anqi Li et.al. |
2306.15561v1 |
link |
2023-06-27 |
A Survey on Deep Learning Hardware Accelerators for Heterogeneous HPC Platforms |
Cristina Silvano et.al. |
2306.15552v1 |
null |
2023-06-27 |
Self-supervised Learning of Event-guided Video Frame Interpolation for Rolling Shutter Frames |
Yunfan Lu et.al. |
2306.15507v1 |
null |
2023-06-26 |
FunQA: Towards Surprising Video Comprehension |
Binzhu Xie et.al. |
2306.14899v1 |
link |
2023-06-26 |
Mapping out phase diagrams with generative classifiers |
Julian Arnold et.al. |
2306.14894v1 |
null |
2023-06-26 |
Fuzzy-Conditioned Diffusion and Diffusion Projection Attention Applied to Facial Image Correction |
Majed El Helou et.al. |
2306.14891v1 |
link |
2023-06-26 |
A Fully Unsupervised Instance Segmentation Technique for White Blood Cell Images |
Shrijeet Biswas et.al. |
2306.14875v1 |
null |
2023-06-26 |
ANYmal Parkour: Learning Agile Navigation for Quadrupedal Robots |
David Hoeller et.al. |
2306.14874v1 |
null |
2023-06-26 |
Leveraging Task Structures for Improved Identifiability in Neural Network Representations |
Wenlin Chen et.al. |
2306.14861v1 |
null |
2023-06-26 |
ViNT: A Foundation Model for Visual Navigation |
Dhruv Shah et.al. |
2306.14846v1 |
null |
2023-06-26 |
An open-source robust machine learning platform for real-time detection and classification of 2D material flakes |
Jan-Lucas Uslu et.al. |
2306.14845v1 |
null |
2023-06-26 |
A Flyweight CNN with Adaptive Decoder for Schistosoma mansoni Egg Detection |
Leonardo de Melo Joao et.al. |
2306.14840v1 |
null |
2023-06-26 |
Label-Aware Hyperbolic Embeddings for Fine-grained Emotion Classification |
Chih-Yao Chen et.al. |
2306.14822v1 |
link |
2023-06-23 |
Adversarial Robustness Certification for Bayesian Neural Networks |
Matthew Wicker et.al. |
2306.13614v1 |
link |
2023-06-23 |
TACOformer: Token-channel compounded Cross Attention for Multimodal Emotion Recognition |
Xinda Li et.al. |
2306.13592v1 |
null |
2023-06-23 |
Estimating Residential Solar Potential Using Aerial Data |
Ross Goroshin et.al. |
2306.13564v1 |
null |
2023-06-23 |
Efficient Model Selection for Predictive Pattern Mining Model by Safe Pattern Pruning |
Takumi Yoshida et.al. |
2306.13561v1 |
null |
2023-06-26 |
FPGA Implementation of Convolutional Neural Network for Real-Time Handwriting Recognition |
Shichen Qiao et.al. |
2306.13557v2 |
link |
2023-06-23 |
Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation |
Massimiliano Patacchiola et.al. |
2306.13554v1 |
link |
2023-06-23 |
Manifold Contrastive Learning with Variational Lie Group Operators |
Kion Fallah et.al. |
2306.13544v1 |
null |
2023-06-23 |
Torsion Graph Neural Networks |
Cong Shen et.al. |
2306.13541v1 |
link |
2023-06-23 |
Topological learning for the classification of disorder: an application to the design of metasurfaces |
Tristan Madeleine et.al. |
2306.13540v1 |
null |
2023-06-23 |
WBCAtt: A White Blood Cell Dataset Annotated with Detailed Morphological Attributes |
Satoshi Tsutsui et.al. |
2306.13531v1 |
link |
2023-06-22 |
A Comparison of Time-based Models for Multimodal Emotion Recognition |
Ege Kesim et.al. |
2306.13076v1 |
null |
2023-06-22 |
Auditing Predictive Models for Intersectional Biases |
Kate S. Boxer et.al. |
2306.13064v1 |
null |
2023-06-22 |
Impacts and Risk of Generative AI Technology on Cyber Defense |
Subash Neupane et.al. |
2306.13033v1 |
null |
2023-06-22 |
Toward Automated Detection of Microbleeds with Anatomical Scale Localization: A Complete Clinical Diagnosis Support Using Deep Learning |
Jun-Ho Kim et.al. |
2306.13020v1 |
|