AI Resources (Under Development)
Sections
Significant Areas - Current Task List - Tutorials - Applications - Prominent Papers
Multi view based 3D models
- Neural Fields Rendering System:
Requirements:
- Store objects as neural network
- It could be a single neural network for each object
- Multiple objects encoded in the same network, using an embedding as input.
- Store an embedding for each voxel and a common neural network. A generalized marching cubes algorithm.
- A hybrid progressive representation
- A global embedding for overall structure and progressive embedding for successive voxel breaking to refine the shape to increased quality
- DeepLS, SDF (it uses a different embedding compared to NSVF). It trains a NN to capture smaller edges. Hierarchical NeRF -> NSVF
- Can transformer networks be used instead of NN?
- Learning material properties: Nerf Issues
- Looks like there is a solution in CVPR which uses albedo and illumination. Environment maps are used to replicate the illumination problem
- Light could be used as a dot product i.e. a NN which takes the direction of light, direction of incident ray and material properties at that point and produces the output at that point
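As a point of reference for the dot-product lighting idea above, here is a minimal sketch of a fixed analytic shading function, not a learned network. It assumes a Lambertian diffuse term plus an optional Blinn-Phong highlight; the names `shade`, `albedo`, `specular` and `shininess` are illustrative and do not come from any of the papers listed here. A learned version would replace `shade` with a small NN taking the same inputs.

```python
import math

def normalize(v):
    """Return v scaled to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v) if n > 0 else tuple(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def shade(normal, light_dir, view_dir, albedo, specular=0.0, shininess=32.0):
    """Analytic stand-in for a learned shading network (assumption).

    light_dir and view_dir both point away from the surface point.
    """
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    diffuse = max(0.0, dot(n, l))                      # Lambertian cosine term
    h = normalize(tuple(a + b for a, b in zip(l, v)))  # Blinn-Phong half vector
    spec = specular * max(0.0, dot(n, h)) ** shininess if diffuse > 0 else 0.0
    return tuple(a * diffuse + spec for a in albedo)
```

For example, `shade((0, 0, 1), (0, 0, 1), (0, 0, 1), (0.5, 0.5, 0.5))` lights the surface head-on and returns the albedo unchanged, while a light direction of `(0, 0, -1)` (behind the surface) gives black.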
Current Task List
- Modify NSVF - embedding and voxel culling
- Modify NeRF - to support SDF experiments, add support of embedding for individual shape using DeepSDF idea
List of ideas:
-
SDF helps in physics-based simulations, i.e. it helps in contact detection between objects (object-object touching), so it may help in hair and cloth simulations
- Experiment with Occupancy and SDF variations
-
Can we split objects into separate objects and then join them together later [GIRAFFE paper]? Create a system which can place objects at different poses and translations, and under different camera poses
-
Use NeRF++ to encode foreground and background objects
-
NeRF for Human shapes: Fashion related human modeling design and animation.
- Garment and hair 3D geometry, and character animation.
- NeRF setup for hair and cloth simulation. Interpenetration of hair strands into head and shoulder
-
Learn a prior using NeRF and learn embeddings using a single input image [PiFU].
- Use the trained NN and extract embeddings based on the input image. Hierarchical NeRF -> NSVF
-
Can embeddings be learnt for different parts like face, hands, clothes? Can we use pose as input? - CPIGAN
- Using PSNR or SSIM to identify areas in the learnt model which are not properly learnt and using this guide the rays to be used for learning.
- Create a weighting map of the areas which are lacking detail
- Sample rays based on this weighting map within a picture
- Sample rays based on weighting map of all the pictures
- Split voxels adaptively based on the PSNR and SSIM distribution
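The error-guided ray sampling steps above can be sketched as follows. This is a minimal illustration under stated assumptions: per-pixel squared error is used as a stand-in for a local PSNR/SSIM score (PSNR is a monotone function of mean squared error), images are plain nested lists of grayscale values, and the function names are hypothetical.

```python
import random

def error_weight_map(rendered, target, eps=1e-6):
    """Per-pixel squared error, normalised so the whole map sums to 1.

    eps keeps every pixel selectable even where the render is already exact.
    """
    errors = [[(r - t) ** 2 + eps for r, t in zip(r_row, t_row)]
              for r_row, t_row in zip(rendered, target)]
    total = sum(sum(row) for row in errors)
    return [[e / total for e in row] for row in errors]

def sample_pixels(weights, n_rays, rng=random):
    """Draw n_rays (row, col) pixels with probability proportional to weight."""
    coords = [(i, j) for i, row in enumerate(weights) for j in range(len(row))]
    flat = [w for row in weights for w in row]
    return rng.choices(coords, weights=flat, k=n_rays)
```

To sample across all pictures rather than within one, the same idea applies with the weight maps of every image concatenated into one list before drawing.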
- Progressively learning model using smaller images and gradually increasing the image dimension as we increase the model accuracy. The aim is to reduce the training time of the 3D model.
- Estimating the bounding box automatically based on the camera position
- Use images downscaled along width x height, probably by ratios of 2, 4, 8
- Select a small subset of camera positions to start with and gradually increase it with iterations
- Allocate the initial voxel size based on all the camera positions and the inter-pixel ray distance; then, as the image dimensions increase, the voxel size decreases accordingly
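A coarse-to-fine schedule along these lines could look like the sketch below. The equal-length stages, the default factors `(8, 4, 2, 1)` and the rule that voxel size scales linearly with the downsampling factor are all assumptions made for illustration, not settings taken from NSVF or NeRF.

```python
def resolution_schedule(full_width, full_height, total_iters, factors=(8, 4, 2, 1)):
    """Split training into equal stages, one per downsampling factor.

    Returns a list of (start_iter, end_iter, width, height) tuples.
    """
    stage_len = total_iters // len(factors)
    schedule = []
    for k, f in enumerate(factors):
        start = k * stage_len
        # The last stage absorbs any remainder iterations.
        end = total_iters if k == len(factors) - 1 else start + stage_len
        schedule.append((start, end, full_width // f, full_height // f))
    return schedule

def voxel_size_for(base_voxel_size, factor):
    """Assumed rule: a pixel at 1/factor resolution covers a footprint
    factor times wider, so the voxel edge grows by the same factor."""
    return base_voxel_size * factor
```

With `resolution_schedule(800, 800, 1000)` training starts on 100x100 images for the first 250 iterations and reaches the full 800x800 images for the last 250.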
- Culling Voxels - This reduces the number of voxels used for representing the object, which reduces memory use. Cull the voxels in the bounding box which are not touched by the rays. Two major regions:
- Voxels inside the object which are not reached by the rays
- Voxels outside the object region which are not touched by the training rays
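A minimal sketch of the second case (voxels never touched by a training ray) is given below. It assumes a regular axis-aligned grid and marks voxels by point-sampling each ray at a fixed step, which can miss voxels if the step is larger than the voxel size; the function name and arguments are hypothetical. The first case (interior voxels hidden behind the surface) would additionally need the learnt density and is not covered here.

```python
def touched_voxels(rays, bbox_min, voxel_size, grid_dims, t_max, step):
    """Return the set of voxel indices hit by at least one sample on any ray.

    rays: iterable of (origin, direction) 3-tuples.
    Everything in the grid but not in the returned set is a culling candidate.
    """
    touched = set()
    n_steps = int(t_max / step)
    for origin, direction in rays:
        for s in range(n_steps + 1):
            t = s * step
            idx = tuple(int((o + t * d - b) // voxel_size)
                        for o, d, b in zip(origin, direction, bbox_min))
            if all(0 <= i < n for i, n in zip(idx, grid_dims)):
                touched.add(idx)
    return touched
```

For a 2x2x2 grid and a single ray travelling along +z through one column, only the two voxels in that column are kept and the other six can be dropped.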
-
NeRF in Motion: Encoding motion for objects in a neural scene. There are different ideas for it.
- Datasets: D-NeRF: Neural Radiance Fields for Dynamic Scenes - Extends the dataset of NERF to dynamics, RigNET, People Snapshot Dataset - Video Based Reconstruction of 3D People Models, BUFF: Bodies Under Flowing Fashion - Detailed, accurate, human shape estimation from clothed {3D} scan sequences, PiFuHD - renderpeople, HDRI Haven, cape, code, utils
- Using a normalized coordinate system, i.e. map actual values to normalised values, then learn a warping function which adds on to them, and then render f(X(x), Y(y), Z(z)). Maps (x,y,z) for each time step. Learn how bones are mapped to mesh pixels (find the transformation function) - Motion Capture based rendering system
- NSVF uses hypernetworks to encode a network for each time step. SRN
- Like mesh objects, can Bones, rigging and weighting be added for the objects thereby making it configurable Bone structure, Blog,
-
Embedding or Latent variable to control different aspects of a 3D generation
- Shape is encoded as a latent vector, and the latent vector together with (xyz) is used for predicting the shape - ShaRF: Shape-conditioned Radiance Fields from a Single View,
- Conditioned on the image, generate an embedding for each voxel, which can be used as input along with xyz - pixelNeRF: Neural Radiance Fields from One or Few Images - Code
-
Relighting of models:
-
Few Shot Learning:
- Learning priors using Bayesian neural networks: Uncertainty Quantification, Weight Uncertainty in NN, Variational Inference in Bayesian NN - The idea is to train priors of shapes using neural networks and then learn the representation from a single input image. The prior stores the range of uncertainty in the variance of the weights, and a single input image is used to find the right instance value. We replace the NN in NeRF with a Bayesian NN and model the uncertainty in shape as the weights of the Bayesian neural network
- Understanding GPT3: Paper, GPT2 Blog, GPT2 Paper, Youtube, Youtube2,
- Transformers for 3d model - Perspective Transformer Nets, Set Transformer, Spatial Transformer for 3D Point Clouds
- Possible continuous learning
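To make the Bayesian-prior item above concrete, here is a toy sketch of weights stored as a mean and a spread, in the style of the reparameterisation used in Weight Uncertainty in NN (w = mu + softplus(rho) * eps). It is a single linear unit in plain Python, not a NeRF; `predict_with_uncertainty` and its arguments are hypothetical names.

```python
import math
import random

def softplus(x):
    """Smooth positive mapping, used so the standard deviation stays > 0."""
    return math.log1p(math.exp(x))

def sample_weights(mu, rho, rng):
    """Reparameterised draw: w = mu + softplus(rho) * eps, eps ~ N(0, 1)."""
    return [m + softplus(r) * rng.gauss(0.0, 1.0) for m, r in zip(mu, rho)]

def predict_with_uncertainty(x, mu, rho, n_samples=200, seed=0):
    """Monte Carlo mean and variance of a linear unit with uncertain weights."""
    rng = random.Random(seed)
    outs = []
    for _ in range(n_samples):
        w = sample_weights(mu, rho, rng)
        outs.append(sum(wi * xi for wi, xi in zip(w, x)))
    mean = sum(outs) / len(outs)
    var = sum((o - mean) ** 2 for o in outs) / len(outs)
    return mean, var
```

A very negative `rho` collapses the weights onto `mu` and the output variance goes to zero, which is the "right instance value" case; larger `rho` leaves the shape uncertain.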
-
Active learning for 3d object reconstruction
-
Structure from motion using deep learning: Turntable
- Can NeRF be modeled to run without the camera parameters? Since we are modelling the neural network as a function of x,y,z, can we learn the model using SGD without the camera parameters?
- Fix one camera position. Model the 3D model as a function of the relative camera parameters. Try minimising the error of the images while we learn the error from different projections.
-
Splitting light, view and time and directly rendering 2D images - X-field - paper, code
-
Compress 3D shape representation by using entries from a codebook like the idea in VQVAE - https://arxiv.org/pdf/1711.00937.pdf,
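The codebook idea can be sketched as a nearest-neighbour lookup: each embedding (for example one per voxel) is replaced by the index of its closest codebook entry. This only illustrates the quantisation step; learning the codebook, as VQ-VAE does, is not shown, and the function names are hypothetical.

```python
def quantize(vector, codebook):
    """Return (index, entry) of the nearest codebook entry (squared L2)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    index = min(range(len(codebook)), key=lambda k: sqdist(vector, codebook[k]))
    return index, codebook[index]

def compress(embeddings, codebook):
    """Store one small integer per embedding instead of the full vector."""
    return [quantize(e, codebook)[0] for e in embeddings]

def decompress(indices, codebook):
    """Recover the (quantised) embeddings from their codebook indices."""
    return [codebook[i] for i in indices]
```

The saving comes from storing one index per voxel plus a shared codebook, at the cost of the quantisation error between each embedding and its nearest entry.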
-
Depth/Height map as implicit functions :
- Using a base template and encoding the shape as a structure over the template - Learning Shape Templates with Structured Implicit Functions, code
- Human models: Converting 3d model to 2d surface plane - developable surface SMPL
- Fitting code
- Understanding SMPL, SMPLX
- Transferring between SMPL-SMPLX-SMPLH-FLAME-MANO
- SMPL - Papers, project - SMPLX - paper, supp, project
- Face Fitting - Flame fitting, FLAME
- Approximating 3d shapes with blobs/ellipsoids - project, Local Deep Implicit Functions for 3D Shape, Learning Shape Templates with Structured Implicit Functions
-
Online 3d models - Dataset - Turbosquid, sketchfab, Pixologic, Zbrush
-
Synching two cameras - libsoftwaresync - Code list:
Nerf Universe:
-
Basic Nerf
-
Ability to choose appropriate texture and material properties based on BRDF
Prominent Papers
-
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis: paper, code, mesh reconstruction, color reproduction, PyMcubes
-
Neural Sparse Voxel Fields: paper
-
Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains: paper
-
Generating Diverse High-Fidelity Images with VQ-VAE-2: paper
-
NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections: paper
-
Neural Rendering: project
-
GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields
-
- Pixelwise View Selection for Unstructured Multi-View Stereo
- COLMAP Documentation - Dense Reconstruction - multi-view - FAQ
- Structure From Motion - RANSAC, RANSAC1, five-point relative pose problem
- tutorial
- undistort images - making lines straight - usually lens causes distortion - camera calibration
- patchmatch stereo algorithm - Stereo Matching, wiki, code,
- stereo_fusion algorithms - slide
- poisson mesh reconstruction - slide
-
Computer Vision Course 2019 - 13 Alignment, 14 Calibration, 15 Single View, 16 Epipolar, 17 SFM, 18 Stereo, 19 Multiview stereo
-
Nerf-W - Nerf in the wild
-
Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations
-
Deformed Implicit Field: Modeling 3D Shapes with Learned Dense Correspondence
-
SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images
-
SDFDiff: Differentiable Rendering of Signed Distance Fields for 3D Shape Optimization - code
-
Understanding illumination: Differentiable renderer idea, Two-shot Spatially-varying BRDF and Shape Estimation code
-
Deep Learning Papers: NeRD: Neural Reflectance Decomposition from Image Collections - project, NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis - project
-
Tutorial on Spherical Gaussians - SG SERIES PART 1-6, Spherical Harmonic Lighting: The Gritty Details, Cook-Torrance Model, Environment mapping Slides
-
A Reflectance Model for Computer Graphics, Bidirectional reflectance distribution function - BRDF - Phong Reflection model - Cook–Torrance model, Physically-Based Shading at Disney, Disney BRDF Base color Metallic parametrization, An Efficient Representation for Irradiance Environment Maps, On the Relationship between Radiance and Irradiance: Determining the illumination from images of a convex Lambertian object, Physically based rendering (PBR), OpenGL PBR,
-
Material Editing - Photorealistic Material Editing Through Direct Image Manipulation, Gaussian Material Synthesis
-
Not Read, Probably good:
- Neural Reflectance Fields for Appearance Acquisition - author
- All-Frequency Rendering of Dynamic, Spatially-Varying Reflectance, tutorial slide - tutorials on using Spherical Gaussians for BRDF and Environment illumination - Portrait Neural Radiance Fields from a Single Image
-
-
Online 3D model assets:
- TurboSquid - Using professional models. This is a good reference
- Environmental Maps - HDRI Haven
-
Depth Estimation from input image: MiDaS, code, latest code - Perpetual View Generation - Infinite Nature
-
Local Deep Implicit Functions for 3D Shape - code, paper - Learning a transformation matrix to approximate a gaussian function to define an implicit function. Learning Shape Templates with Structured Implicit Functions - Can we use this to approximate shapes
-
Understanding SDF based operations - ray marching and sdf - article1, article2, SDF rendering, Sphere Tracing
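For orientation, a minimal sphere tracing sketch: the ray advances by the SDF value at the current point, which is always a safe step because nothing is closer than that distance. The unit sphere SDF, the tolerance `eps` and the step limit are illustrative assumptions, and `direction` is assumed to be unit length.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from point p to a sphere (negative inside)."""
    return math.dist(p, center) - radius

def sphere_trace(origin, direction, sdf, t_max=100.0, eps=1e-4, max_steps=128):
    """Return the distance along the ray to the first hit, or None on a miss."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:       # close enough to the surface: report a hit
            return t
        t += dist            # safe step: no surface is nearer than dist
        if t > t_max:
            return None
    return None
```

A ray from `(0, 0, -3)` along `+z` hits the unit sphere at `t = 2`; the same ray shifted to `y = 2` passes the sphere and returns `None`.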
-
canonical coordinates 3d model, quaternion representation of rotation
-
Adding secondary motion - Complementary Dynamics
-
Not read: Learning to Recover 3D Scene Shape from a Single Image, Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations, Object-Centric Neural Scene Rendering, computer assisted prog, signed-distance-SRN, differentiable_volumetric_rendering,
-
Quaternions - ref1, ref2, SE(3) transform
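As a quick reference for the quaternion items, the sketch below rotates a vector with a unit quaternion in (w, x, y, z) order using the Hamilton product, v' = q v q*. The function names are illustrative and not taken from the linked references.

```python
import math

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about `axis`."""
    n = math.sqrt(sum(a * a for a in axis))
    s = math.sin(angle / 2.0) / n
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def quat_mul(a, b):
    """Hamilton product of two quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def rotate(q, v):
    """Rotate 3-vector v by unit quaternion q via q * (0, v) * conj(q)."""
    q_conj = (q[0], -q[1], -q[2], -q[3])
    return quat_mul(quat_mul(q, (0.0, *v)), q_conj)[1:]
```

Rotating `(1, 0, 0)` by 90 degrees about the z axis gives `(0, 1, 0)` up to floating-point error.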
-
NeRF−−: Neural Radiance Fields Without Known Camera Parameters
-
Companies - Teriflix, Searching for artist, paperboatstudios, studiodurga
AI Sculptor: Treating each ray as a nail, we would like to sculpt a 3D model based on multiple input views. We use NSVF as our base code. The list of problems we are planning to attack:
- Marching cubes would fail to recover the surface, since the function learns only the boundary and does not learn the inside of the object. So a different learning algorithm is required for learning the 3D mesh object. We need to find a better algorithm to extract the 3D mesh and surface information.
- A solution would be to shoot rays from different directions and choose only those points whose normals are parallel to the intersecting ray. Use these points as input to create a point cloud, and then convert this point cloud to a mesh using pointcloud2mesh algorithms, Surface Reconstruction
- Alternative representation: A signed distance field represents objects better than a transparency representation. Can we use SDF instead of transparency? Define a rendering function using SDF. Use this function instead of the transparency-based rendering function, and then use it to represent the neural scene: DeepSDF, DeepLS, Papier-Mâché, Occupancy Networks
- Hierarchical representation: NeRF/DeepSDF use a NN to represent the whole scene. NSVF/DeepLS use local embedding information to represent shapes in voxels. Can a hierarchical representation of latent variables be used to represent the whole shape and get a consistent representation?
- Can the shape information used as code in DeepSDF be used in a NeRF/NSVF kind of setup to reduce the number of images needed, encoding a scene directly from a single image?
- Physics simulation using NeRF: Real Time Fluid simulation, Learning to simulate
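One possible shape for the SDF-based rendering function asked about above is sketched here. The mapping from signed distance to density (a logistic of -sdf/beta scaled by 1/beta) is an assumption chosen for illustration; the compositing loop is the usual NeRF quadrature, with alpha_i = 1 - exp(-sigma_i * delta_i) and weight_i = T_i * alpha_i. The names `sdf_to_density`, `render_ray` and `beta` are hypothetical.

```python
import math

def sdf_to_density(sdf_value, beta=0.05):
    """Assumed mapping: density is about 1/beta inside (sdf < 0), about 0 outside."""
    x = min(sdf_value / beta, 60.0)   # clamp to avoid overflow in exp
    return (1.0 / beta) / (1.0 + math.exp(x))

def render_ray(sdf_values, colors, deltas, beta=0.05):
    """Composite samples front to back; returns (rgb, remaining transmittance)."""
    transmittance = 1.0
    pixel = [0.0, 0.0, 0.0]
    for s, c, delta in zip(sdf_values, colors, deltas):
        alpha = 1.0 - math.exp(-sdf_to_density(s, beta) * delta)
        weight = transmittance * alpha
        pixel = [p + weight * ci for p, ci in zip(pixel, c)]
        transmittance *= 1.0 - alpha
    return tuple(pixel), transmittance
```

A smaller `beta` makes the transition at the zero level set sharper, which moves the result closer to a hard surface render.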
List of questions
- multiresolution in signed distance function, how are multiple objects composed using SDF, What is the maximum capacity of NN?, occupancy vs sdf comparison
- a KD-tree - nearest neighbor algorithm -
- distance transform for any watertight shapes
- Weight normalization: A simple reparameterization to accelerate training of deep neural networks. - speed up training instead of batch normalization
- H.P.: Multi-level partition of unity implicits. - Splitting a volume into smaller regions and trying to generate the whole shape based on the sum of the smaller regions - weighting function is 1 when you are within the region - https://www.cc.gatech.edu/~turk/my_papers/mpu_implicits.pdf - https://en.wikipedia.org/wiki/Partition_of_unity
- datasets: https://3dwarehouse.sketchup.com/, http://graphics.stanford.edu/data/3Dscanrep/
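On the question of how multiple objects are composed using SDFs, the standard constructive operations are min for union, max for intersection, and max(a, -b) for difference. The sketch below uses two spheres as stand-in shapes; the results have the correct sign everywhere but are not always exact distances, which matters if step sizes rely on them.

```python
import math

def sphere(center, radius):
    """SDF of a sphere as a function of a 3D point."""
    return lambda p: math.dist(p, center) - radius

def union(a, b):
    """Inside if inside either shape."""
    return lambda p: min(a(p), b(p))

def intersection(a, b):
    """Inside only where both shapes overlap."""
    return lambda p: max(a(p), b(p))

def difference(a, b):
    """Shape a with shape b carved out."""
    return lambda p: max(a(p), -b(p))
```

For two unit spheres centred at `(0, 0, 0)` and `(1.5, 0, 0)`, the point `(0.75, 0, 0)` is inside the union and the intersection but outside the difference.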
Other queries
- AI Economist
- DALL·E: Creating Images from Text
- Transformers are Graph Neural Networks
- Immersive Light Field Video with a Layered Mesh Representation
- EFFICIENT CONTINUAL LEARNING WITH MODULAR NETWORKS AND TASK-DRIVEN PRIORS
- IIRC: Incremental Implicitly-Refined Classification
- An algorithm to solve generic problems: C-Space Tunnel Discovery for Puzzle Path Planning, More Puzzles, paper - Concentrate on the different planning algorithms. The basic idea is to find tunnels (i.e. narrow points which could possibly lead to a solution), then grow a double tree: one from the start and one, in reverse, from the goal. This could be done for every tunnel point. This stage is the blooming stage. You could approximate the value of the tree using neural networks. Try finding links between the bloomed trees and the successive tunnel paths. If we find a route, find the optimal route from start to end - Probabilistic Roadmap Path Planning PRM Planner, PRM1, RDT-Based Methods - dual-tree RDT algorithm
- Interlinked SPH Pressure Solvers for Strong Fluid-Rigid Coupling,
Significant Areas
- Bayesian Optimization based GAN latent variable search
- Bayesian Neural network based BO
- StyleGAN
- Camera pose estimate
- Fourier feature networks
- Photogrammetry - software
- Part segmentation
- HairNet
- Sculpting
- SMPL
- Image Harmonization
- Pose Transfer
- Multi-view rendering
- HyperNetworks
- Continuous Learning - Iman, paper, Review Paper, Tackling catastrophic forgetting, Incremental learning
- 3D Machine Learning Github
To Be Read Papers
- Interactive Video Stylization, Resolution Dependent GAN, Video Completion, Latent Space of Generative Networks, Bayesian Deep Learning for Computer Vision, Real Image Editing GAN, Rewriting a Deep Generative Model
- Software: AI Vdo, CineTracer, GPU Sharing, SpyFU, OBS Studio
- Neural Sampling
Current Task List
Tutorials
- GAN basics,
- Running models on GPU
- Framework - Pytorch, Keras, numpy, Pytorch3D
- AlphaZero,
- CPISADGAN
- BO
- MultiView Geometry, Camera Basics, Rotation Matrix, Projective, Affine, Euclidean Geometry, NDC Co-ordinate Systems
Applications
Graphics related
- Omniverse, Signed Distance Field Collision, Specular Manifold Sampling, Detailed Rigid Body Simulation
- Character creation: Reallusion, Adobe Fuse/Mixamo
Sections
Children resources - Videos - Accelerators - Channels followed - GPU and CUDA resource - Projects
Tutorials
Children resources
Videos
- Heroes of Deep Learning: Andrew Ng interviews Pieter Abbeel
- DeepMind open-sources StarCraft II AI
- OpenAI bots beat DOTA world champion
- DeepMind and UCL coursework
List of topics to be covered
Good resources
- Sutton Book: Book Draft, Code And Examples
- Deep Mind: RL Course by David Silver - First of a 10 lecture series - http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html
Accelerators
Job Options
Internships
- Studio list - Pixar, Nickelodeon, Laika, DreamWorks, Blue Sky, Warner Brothers, Cartoon Network, Blizzard
- How to get a Pixar internship
- Animation colleges - conceptdesignacad
- How to get a job at Blizzard
Portfolio creation
- DeviantArt, Instagram, YouTube
- List of inspiring artists: bobby pontillas, rafael grassetti human anatomy, youtube, Laura Price MY ART JOURNEY, Michael Vicente, Ross draws
- Polycount, zbrush central, artstation
- Books - Composing Pictures, illusion of life, Animators survival kit
Channels followed
GPU and CUDA resource
-
AWS:
- p2 - NVIDIA Tesla K80 - 2 x 2496
- G3 - NVIDIA Tesla M60 - 4096 NVIDIA CUDA® cores (2048 per GPU)
- G2 - NVIDIA GRID K520 GPUs - 3072 core
- CG1 - NVIDIA Tesla M2050