star technology and research
- How Disney Packed Big Emotion Into a Little Robot (spectrum.ieee.org)
Melding animation and reinforcement learning for free-ranging emotive performances
- GitHub - AnMnv/eBook: Beautiful LaTeX book with examples, open-source eBook (github.com)
- gs2d (Generic Serial-bus Servo Driver) Library - karakuri-products/gs2d (github.com)
- The Little Computer That Could - #AI, #SBC (www.hackster.io)
The powerful Stable Diffusion XL 1.0 image generator can now run on a Raspberry Pi Zero 2 W with 512 MB of RAM using OnnxStream.
- Anything World: Animate 3D Models with AI - #AI, #VR (app.anything.world)
3D Animation and Automated Rigging platform powered by Machine Learning. Unlimited 3D models for games, apps and the metaverse.
- SVG Customization: Text-Guided Vector Graphics Customization - #AI (intchous.github.io)
- AudioLDM2: Text-to-Audio Generation with Latent Diffusion Models - Speech Research - #AI
Although audio generation shares commonalities across different types of audio, such as speech, music, and sound effects, designing models for each type requires careful consideration of specific objectives and biases that can significantly differ from those of other types. To bring us closer to a unified perspective of audio generation, this paper proposes a framework that utilizes the same learning method for speech, music, and sound effect generation. Our framework introduces a general representation of audio, called "language of audio" (LOA). Any audio can be translated into LOA based on AudioMAE, a self-supervised pre-trained representation learning model. In the generation process, we translate any modalities into LOA by using a GPT-2 model, and we perform self-supervised audio generation learning with a latent diffusion model conditioned on LOA. The proposed framework naturally brings advantages such as in-context learning abilities and reusable self-supervised pretrained AudioMAE and latent diffusion models. Experiments on the major benchmarks of text-to-audio, text-to-music, and text-to-speech demonstrate new state-of-the-art or competitive performance to previous approaches.
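A minimal text-to-audio sketch using the AudioLDM2 pipeline shipped with Hugging Face diffusers (not the training framework described in the paper); the "cvssp/audioldm2" checkpoint name and the parameter values shown are assumptions based on current diffusers releases:

```python
# Minimal text-to-audio sketch with the AudioLDM2 pipeline from Hugging Face diffusers.
# Assumes the diffusers, transformers, and scipy packages and the "cvssp/audioldm2"
# checkpoint; adjust the model id to whichever release you actually use.
import scipy.io.wavfile
import torch
from diffusers import AudioLDM2Pipeline

pipe = AudioLDM2Pipeline.from_pretrained("cvssp/audioldm2", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "a gentle rain shower on a tin roof"
audio = pipe(prompt, num_inference_steps=200, audio_length_in_s=10.0).audios[0]

# AudioLDM2 generates 16 kHz mono audio.
scipy.io.wavfile.write("rain.wav", rate=16000, data=audio)
```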
- RealityKit Overview - Augmented Reality - Apple Developer - #XR (developer.apple.com)
Use the Reality Composer app and RealityKit to build animations and interactions in iOS and macOS to enrich your 3D content.
- TinyStories Candle Wasm Magic - a Hugging Face Space by radames - #AI (huggingface.co)
- DEFCON 31 - Snoop Unto Them, As They Snoop Unto Us (blog.dataparty.xyz)
The official videos from DEFCON 31 have been posted! Below you can watch our talk “Snoop unto them as they snoop unto you”, along with the slides and files.
- PhyMask: Robust Sensing of Brain Activity and Physiological Signals During Sleep with an All-textile Eye Mask - #Sense, #Fabric
Clinical-grade wearable sleep monitoring is a challenging problem since it requires concurrently monitoring brain activity, eye movement, muscle activity, cardio-respiratory features, and gross body movements. This requires multiple sensors to be worn at different locations as well as uncomfortable adhesives and discrete electronic components to be placed on the head. As a result, existing wearables either compromise comfort or compromise accuracy in tracking sleep variables. We propose PhyMask, an all-textile sleep monitoring solution that is practical and comfortable for continuous use and that acquires all signals of interest to sleep solely using comfortable textile sensors placed on the head. We show that PhyMask can be used to accurately measure all the signals required for precise sleep stage tracking and to extract advanced sleep markers such as spindles and K-complexes robustly in the real-world setting. We validate PhyMask against polysomnography (PSG) and show that it significantly outperforms two commercially-available sleep tracking wearables—Fitbit and Oura Ring.
- DisPad: Flexible On-Body Displacement of Fabric Sensors for Robust Joint-Motion Tracking - #Sense, #AI, #Fabric
The last few decades have witnessed an emerging trend of wearable soft sensors; however, there are important signal-processing challenges for soft sensors that still limit their practical deployment. They are error-prone when displaced, resulting in significant deviations from their ideal sensor output. In this work, we propose a novel prototype that integrates an elbow pad with a sparse network of soft sensors. Our prototype is fully bio-compatible, stretchable, and wearable. We develop a learning-based method to predict the elbow orientation angle and achieve an average tracking error of 9.82 degrees for single-user multi-motion experiments. With transfer learning, our method achieves average tracking errors of 10.98 degrees and 11.81 degrees across different motion types and users, respectively. Our core contributions lie in a solution that realizes robust and stable human joint motion tracking across different device displacements.
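The paper's core recipe (train a joint-angle regressor on fabric-sensor readings, then transfer it to new users or motions) can be illustrated with a toy sketch; the network shape, the synthetic data, and the choice to fine-tune only the last layer are illustrative assumptions, not the authors' implementation:

```python
# Sketch of the DisPad idea (not the authors' model): regress elbow angle from a
# sparse set of fabric-sensor channels, then fine-tune the trained model on a small
# amount of data from a new user/motion (simple transfer learning).
import numpy as np
import torch
import torch.nn as nn

def make_data(n, offset):
    """Synthetic stand-in: 6 sensor channels -> elbow angle in degrees."""
    x = np.random.rand(n, 6).astype(np.float32)
    y = (x.sum(1, keepdims=True) * 20 + offset).astype(np.float32)
    return torch.from_numpy(x), torch.from_numpy(y)

model = nn.Sequential(nn.Linear(6, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.L1Loss()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

# Train on the source user...
x, y = make_data(2000, offset=0.0)
for _ in range(200):
    opt.zero_grad(); loss_fn(model(x), y).backward(); opt.step()

# ...then fine-tune only the last layer on a little data from a new user.
for p in model[0].parameters():
    p.requires_grad = False
xa, ya = make_data(100, offset=5.0)
opt = torch.optim.Adam(model[2].parameters(), lr=1e-2)
for _ in range(100):
    opt.zero_grad(); loss_fn(model(xa), ya).backward(); opt.step()

print("fine-tuned MAE (deg):", loss_fn(model(xa), ya).item())
```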
- #XR - DeepMix: mobility-aware, lightweight, and hybrid 3D object detection for headsets
Mobile headsets should be capable of understanding 3D physical environments to offer a truly immersive experience for augmented/mixed reality (AR/MR). However, their small form-factor and limited computation resources make it extremely challenging to execute 3D vision algorithms in real time, which are known to be more compute-intensive than their 2D counterparts. In this paper, we propose DeepMix, a mobility-aware, lightweight, and hybrid 3D object detection framework for improving the user experience of AR/MR on mobile headsets. Motivated by our analysis and evaluation of state-of-the-art 3D object detection models, DeepMix intelligently combines edge-assisted 2D object detection and novel, on-device 3D bounding box estimations that leverage depth data captured by headsets. This leads to low end-to-end latency and significantly boosts detection accuracy in mobile scenarios. A unique feature of DeepMix is that it fully exploits the mobility of headsets to fine-tune detection results and boost detection accuracy. To the best of our knowledge, DeepMix is the first 3D object detection framework that achieves 30 FPS (i.e., an end-to-end latency well below the stringent 100 ms requirement of interactive AR/MR). We implement a prototype of DeepMix on Microsoft HoloLens and evaluate its performance via both extensive controlled experiments and a user study with 30+ participants. DeepMix not only improves detection accuracy by 9.1–37.3% but also reduces end-to-end latency by 2.68–9.15×, compared to the baseline that uses existing 3D object detection models.
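A rough NumPy sketch of the central idea (not the authors' code): take a 2D box from the edge-assisted detector and lift it to a 3D box on-device using the headset's depth map. It assumes a pinhole intrinsics matrix and a depth map aligned with the RGB frame, both invented for the example:

```python
# Rough sketch of the DeepMix idea: take a 2D box from an (edge-assisted) detector
# and estimate a 3D bounding box on-device from the headset's depth map.
# Not the authors' implementation; K and the depth map are synthetic placeholders.
import numpy as np

def lift_box_to_3d(box_2d, depth, K):
    """box_2d = (x0, y0, x1, y1) in pixels; depth is HxW in metres; K is 3x3 intrinsics."""
    x0, y0, x1, y1 = box_2d
    patch = depth[y0:y1, x0:x1]
    patch = patch[patch > 0]                          # drop invalid depth pixels
    z_near, z_far = np.percentile(patch, [10, 90])    # robust front/back of the object

    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    z_mid = (z_near + z_far) / 2.0

    # Back-project the 2D box edges at the mid depth to get metric width/height.
    xs = (np.array([x0, x1]) - cx) * z_mid / fx
    ys = (np.array([y0, y1]) - cy) * z_mid / fy

    center = np.array([xs.mean(), ys.mean(), z_mid])
    size = np.array([xs[1] - xs[0], ys[1] - ys[0], z_far - z_near])
    return center, size  # axis-aligned 3D box (centre, extents)

# Example with synthetic data: a box-shaped object 1.5 m away in a 3 m room.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
depth = np.full((480, 640), 3.0)
depth[200:280, 300:380] = 1.5
print(lift_box_to_3d((300, 200, 380, 280), depth, K))
```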
- #Sense - m3Track: mmWave-based multi-user 3D posture tracking
Nowadays, the market of 3D human posture tracking has extended to a broad range of application scenarios. As current mainstream solutions, vision-based posture tracking systems suffer from privacy leakage concerns and depend on lighting conditions. Toward a more privacy-preserving and robust tracking manner, recent works have exploited commodity radio frequency signals to realize 3D human posture tracking. However, these studies cannot handle the case where multiple users are in the same space. In this paper, we present an mmWave-based multi-user 3D posture tracking system, m3Track, which leverages a single commercial off-the-shelf (COTS) mmWave radar to track multiple users' postures simultaneously as they move, walk, or sit. Based on the sensing signals from a mmWave radar in multi-user scenarios, m3Track first separates all the users on mmWave signals. Then, m3Track extracts shape and motion features of each user, and reconstructs 3D human posture for each user through a designed deep learning model. Furthermore, m3Track maps the reconstructed 3D postures of all users into 3D space, and tracks users' positions through a coordinate-corrected tracking method, realizing practical multi-user 3D posture tracking with a COTS mmWave radar. Experiments conducted in real-world multi-user scenarios validate the accuracy and robustness of m3Track on multi-user 3D posture tracking.
- CoDL: efficient CPU-GPU co-execution for deep learning inference on mobile devices - #Sense, #AI
Concurrent inference execution on heterogeneous processors is critical to improve the performance of increasingly heavy deep learning (DL) models. However, available inference frameworks can only use one processor at a time, or hardly achieve speedup by concurrent execution compared to using one processor. This is due to the challenges to 1) reduce data sharing overhead, and 2) properly partition each operator between processors. By solving the challenges, we propose CoDL, a concurrent DL inference framework for the CPU and GPU on mobile devices. It can fully utilize the heterogeneous processors to accelerate each operator of a model. It integrates two novel techniques: 1) hybrid-type-friendly data sharing, which allows each processor to use its efficient data type for inference. To reduce data sharing overhead, we also propose hybrid-dimension partitioning and operator chain methods; 2) non-linearity- and concurrency-aware latency prediction, which can direct proper operator partitioning by building an extremely light-weight but accurate latency predictor for different processors. Based on the two techniques, we build the end-to-end CoDL inference framework, and evaluate it on different DL models. The results show up to 4.93× speedup and 62.3% energy saving compared with the state-of-the-art concurrent execution system.
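An illustrative sketch of the partitioning decision (not the CoDL code): given predicted CPU and GPU latencies plus an estimated data-sharing overhead for one operator, split it so both processors finish at about the same time, and skip co-execution when the overhead erases the gain. The latency numbers and the proportional split model are assumptions:

```python
# Illustrative sketch of CoDL-style operator partitioning (not the authors' code):
# split one operator between CPU and GPU so both finish at roughly the same time,
# falling back to a single processor when the predicted co-execution gain is negative.
from dataclasses import dataclass

@dataclass
class OpLatency:
    cpu_ms: float    # predicted latency if the whole operator runs on the CPU
    gpu_ms: float    # predicted latency if the whole operator runs on the GPU
    share_ms: float  # predicted data-sharing overhead for co-execution

def partition(op: OpLatency):
    """Return (cpu_fraction, predicted_ms) for one operator."""
    # Equal-finish-time split: cpu_frac * cpu_ms == (1 - cpu_frac) * gpu_ms
    cpu_frac = op.gpu_ms / (op.cpu_ms + op.gpu_ms)
    co_exec_ms = cpu_frac * op.cpu_ms + op.share_ms
    best_single_ms = min(op.cpu_ms, op.gpu_ms)
    if co_exec_ms >= best_single_ms:            # overhead eats the gain; don't split
        return (1.0 if op.cpu_ms <= op.gpu_ms else 0.0), best_single_ms
    return cpu_frac, co_exec_ms

print(partition(OpLatency(cpu_ms=12.0, gpu_ms=8.0, share_ms=1.5)))   # worth splitting
print(partition(OpLatency(cpu_ms=12.0, gpu_ms=2.0, share_ms=1.5)))   # GPU alone wins
```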
- Predicting Tap Locations on Touch Screens in the Field Using Accelerometer and Gyroscope Sensor Readings (link.springer.com)
Research has shown that the location of touch screen taps on modern smartphones and tablet computers can be identified based on sensor recordings from the device's accelerometer and gyroscope. This security threat implies that an attacker could launch a background process on the mobile device and send the motion sensor readings to a third-party vendor for further analysis. Even though the location inference is a non-trivial task requiring machine learning algorithms to predict the tap location, previous research was able to show that PINs and passwords of users could be successfully obtained. However, as tap location inference was only shown for taps generated in a controlled setting that does not reflect how users naturally engage with their smartphones, this paper bridges that gap. We propose TapSensing, a data acquisition system designed to collect touch screen tap event information with corresponding accelerometer and gyroscope readings. Having performed a data acquisition study with 27 participants and 3 different iPhone models, a total of 25,000 labeled taps could be acquired from a laboratory and field environment, enabling a direct comparison of both settings. The overall findings show that tap location inference is generally possible for data acquired in the field, albeit with a performance reduction of approximately 20% when comparing both environments. As tap inference has therefore been shown for a more realistic data set, this work shows that smartphone motion sensors could potentially be used to compromise users' privacy in any setting in which they interact with their devices.
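A toy sketch of the underlying attack (not the TapSensing pipeline): classify which screen region was tapped from short accelerometer + gyroscope windows. The synthetic data and simple per-axis statistics stand in for the paper's real recordings and hand-crafted features:

```python
# Toy sketch of the attack idea: predict the tapped screen region from short
# accelerometer + gyroscope windows. Synthetic data; a real attack would log the
# motion sensors in a background process while the user types.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_taps, window = 2000, 50                        # 50 samples of 6-axis IMU per tap
regions = rng.integers(0, 9, size=n_taps)        # 3x3 grid of screen regions

# Synthetic IMU windows whose statistics shift slightly with the tapped region.
imu = rng.normal(size=(n_taps, window, 6)) + regions[:, None, None] * 0.05

def featurize(w):
    # Simple per-axis statistics over the window.
    return np.concatenate([w.mean(0), w.std(0), w.min(0), w.max(0)])

X = np.array([featurize(w) for w in imu])
X_tr, X_te, y_tr, y_te = train_test_split(X, regions, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("tap-region accuracy:", clf.score(X_te, y_te))
```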
- FabToys: Plush Toys with Large Arrays of Fabric-based Pressure Sensors to Enable Fine-grained Interaction Detection
https://doi.org/10.1145/3498361.3538931 In this work, we propose FabToy, a plush toy instrumented with a 24-sensor array of fabric-based pressure sensors located beneath the surface of the toy to have dense spatial sensing coverage while maintaining the natural feel of fabric and softness of the toy. We optimize both the hardware and software pipeline to reduce overall power consumption while achieving high accuracy in detecting a wide range of interactions at different regions of the toy. Our contributions include a) sensor array fabrication to maximize coverage and dynamic range, b) data acquisition and triggering methods to minimize the cost of sampling a large number of channels, and c) neural network models with early exit to optimize power consumed for computation when processing locally and autoencoder-based channel aggregation to optimize power consumed for communication when processing remotely. We demonstrate that we can achieve high accuracy of more than 83% for robustly detecting and localizing complex human interactions such as swiping, patting, holding, and tickling in different regions of the toy.
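A small PyTorch sketch of the early-exit idea (not the authors' model): a cheap head over the 24 pressure channels returns immediately when confident, and the rest of the network runs only for harder inputs. The layer sizes, class count, and confidence threshold are made-up values:

```python
# Sketch of an early-exit classifier over a 24-channel fabric pressure array:
# a cheap head answers easy cases, the deeper head handles the rest.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitNet(nn.Module):
    def __init__(self, n_channels=24, n_classes=10, threshold=0.9):
        super().__init__()
        self.threshold = threshold
        self.block1 = nn.Sequential(nn.Linear(n_channels, 32), nn.ReLU())
        self.exit1 = nn.Linear(32, n_classes)           # cheap early head
        self.block2 = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
        self.exit2 = nn.Linear(64, n_classes)           # full head

    def forward(self, x):
        h = self.block1(x)
        p1 = F.softmax(self.exit1(h), dim=-1)
        if p1.max() >= self.threshold:                   # confident enough: stop early
            return p1, "early"
        return F.softmax(self.exit2(self.block2(h)), dim=-1), "full"

model = EarlyExitNet()
probs, path = model(torch.randn(1, 24))                 # one frame of sensor readings
print(path, probs.argmax(dim=-1).item())
```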
- Tracking Sleep With a Self-Powering Smart Pillow - Neuroscience News - #Sense (neurosciencenews.com)
A newly designed "smart pillow" that tracks the position of the head during sleep could help to track and monitor sleep quality and duration in those with sleep disorders.
Original article: https://pubs.acs.org/doi/10.1021/acsami.2c03056
- Sensing in Soft Robotics - #Sense
Some relevant information on elastomers: tensile strength, Shore hardness, etc.
- Stable Audio: Fast Timing-Conditioned Latent Audio Diffusion - Stability AI - #AI, #Audio (stability.ai)
Stable Audio represents the cutting-edge audio generation research by Stability AI’s generative audio research lab, Harmonai. We continue to improve our model architectures, datasets, and training procedures to improve output quality, controllability, inference speed, and output length.
- ControlNet – Achieving Superior Image Generation Results - #AI (learnopencv.com)
In this article, we explore ControlNet models for image generation, which give users more control when generating images for a particular pose or style.
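A minimal ControlNet sketch using Hugging Face diffusers: condition Stable Diffusion on a Canny edge map so the generated image keeps the structure of the input. The checkpoint names ("lllyasviel/sd-controlnet-canny", "runwayml/stable-diffusion-v1-5") and the input file "pose.png" are assumptions taken from typical diffusers examples, not from the article:

```python
# Minimal ControlNet example: guide Stable Diffusion with a Canny edge map so the
# output keeps the pose/structure of the source image.
# Assumes the diffusers and opencv-python packages and the checkpoints named below.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

source = np.array(Image.open("pose.png").convert("RGB"))        # hypothetical input
edges = cv2.Canny(cv2.cvtColor(source, cv2.COLOR_RGB2GRAY), 100, 200)
control = Image.fromarray(np.stack([edges] * 3, axis=-1))        # 3-channel edge map

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

image = pipe("a robot dancing in a neon-lit alley", image=control,
             num_inference_steps=30).images[0]
image.save("robot.png")
```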