AI Resources (Under Development)

Sections

Significant Areas - Current Task List - Tutorials - Applications - Prominent Papers

Multi view based 3D models

Neural Fields Rendering System: Requirements:
- Store objects as neural network
  - It could be a single neural network for each object
  - Multiple objects encoded in the same network, using an embedding as input.
  - Store an embedding for each voxel and a common neural network. A generalized marching cubes algorithm.
  - A hybrid progressive representation
    - A global embedding for overall structure and progressive embedding for successive voxel breaking to refine the shape to increased quality
    - DeepLS, SDF (it has a different embedding compared to NSVF). It tries to train the NN to learn smaller edges. Hierarchical NeRF -> NSVF
  - Can transformer networks be used instead of NN?
- Learning material properties: NeRF issues
  - Looks like there is a solution in CVPR which uses albedo and illumination. Environment maps are used to replicate the illumination problem
  - Light could be used as a dot product i.e. a NN which takes the direction of light, direction of incident ray and material properties at that point and produces the output at that point
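As a reference point for the "light as a dot product" idea above, here is a minimal analytic sketch of what such a network would have to reproduce: a Lambertian diffuse term plus a Blinn-Phong specular term. The function name `shade`, its parameters, and the choice of Blinn-Phong are assumptions for illustration, not something taken from a specific paper in these notes.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def shade(normal, light_dir, view_dir, albedo, specular, shininess):
    """Hypothetical analytic stand-in for the proposed shading network.

    light_dir and view_dir point away from the surface point
    (towards the light and towards the camera respectively).
    """
    n, l, v = normalize(normal), normalize(light_dir), normalize(view_dir)
    n_dot_l = dot(n, l)
    if n_dot_l <= 0.0:
        return 0.0  # the light is behind the surface
    diffuse = albedo * n_dot_l  # Lambertian term: plain dot product
    half = tuple(a + b for a, b in zip(l, v))
    if dot(half, half) < 1e-12:
        return diffuse  # light and view are opposite: no highlight
    spec = specular * max(0.0, dot(n, normalize(half))) ** shininess
    return diffuse + spec
```

A learned version would replace the body of `shade` with a small network taking the same three inputs (light direction, ray direction, material properties at the point).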

Current Task List

Modify NSVF - embedding and voxel culling
Modify NeRF - to support SDF experiments, add support of embedding for individual shape using DeepSDF idea
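A rough sketch of the DeepSDF-style conditioning mentioned in the task above: a per-shape latent code is concatenated with the query point and fed to one shared network. The layer sizes, the random initialisation and the names `make_mlp` and `conditioned_sdf` are assumptions for illustration; the weights here are untrained.

```python
import math
import random


def make_mlp(sizes, seed=0):
    """Build random (untrained) dense layers, e.g. sizes=[11, 16, 1]."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def conditioned_sdf(point, latent, layers):
    """Evaluate f(latent, xyz): one shared network, one code per shape."""
    h = list(latent) + list(point)  # the conditioning is a concatenation
    for i, (weights, bias) in enumerate(layers):
        if len(h) != len(weights[0]):
            raise ValueError("input size does not match layer size")
        h = [sum(w * x for w, x in zip(row, h)) + b
             for row, b in zip(weights, bias)]
        if i < len(layers) - 1:
            h = [max(0.0, v) for v in h]  # ReLU on hidden layers only
    return h[0]
```

Switching shapes then means switching only the latent vector while the network weights stay shared.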

List of ideas:

SDFs help in physics based simulations, i.e. they help in contact detection between objects (object-to-object touching etc.), so they may help in hair and cloth simulations
- Experiment with Occupancy and SDF variations
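To make the contact detection remark concrete, here is a small sketch: evaluate object A's SDF at sample points on object B's surface, and treat any negative value as penetration. Two spheres are used only because their SDF is known in closed form; the helper names and the Fibonacci sampling are assumptions for illustration.

```python
import math


def sphere_sdf(point, center, radius):
    """Signed distance to a sphere: negative inside, positive outside."""
    return math.dist(point, center) - radius


def sphere_surface_samples(center, radius, count=200):
    """Roughly uniform surface samples using a Fibonacci spiral."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(count):
        z = 1.0 - 2.0 * (i + 0.5) / count
        ring = math.sqrt(1.0 - z * z)
        phi = i * golden_angle
        points.append((center[0] + radius * ring * math.cos(phi),
                       center[1] + radius * ring * math.sin(phi),
                       center[2] + radius * z))
    return points


def penetration_depth(sdf_a, surface_points_b):
    """Deepest penetration of B's samples into A (0.0 means no contact)."""
    deepest = min(sdf_a(p) for p in surface_points_b)
    return max(0.0, -deepest)
```

With a learned SDF, `sdf_a` would simply be the network evaluated at the query point; the contact test itself does not change.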
Can we split a scene into separate objects and then join them together later [GIRAFFE paper]? Create a system which can align objects at different poses and translations, together with the camera pose
Use NeRF++ to encode foreground and background objects
NeRF for Human shapes: Fashion related human modeling design and animation.
- Garment, hair 3D geometry and character animation.
- NeRF setup for hair and cloth simulation. Interpenetration of hair strands into head and shoulder
Learn a prior using NeRF and learn embeddings using a single input image [PIFu].
- Use the trained NN and extract embeddings based on the input image. Hierarchical NeRF -> NSVF
Can embeddings be learnt for different parts like face, hands, clothes? Can we use pose as input - CPIGAN
- Using PSNR or SSIM to identify areas in the learnt model which are not properly learnt and using this guide the rays to be used for learning.
  - Create a weighting map of the areas which are lacking detail
  - Sample rays based on this weighting map within a picture
  - Sample rays based on weighting map of all the pictures
  - Split voxels adaptively based on the PSNR and SSIM distribution
- Progressively learning model using smaller images and gradually increasing the image dimension as we increase the model accuracy. The aim is to reduce the training time of the 3D model.
  - Estimating the bounding box automatically based on the camera position
  - Use smaller versions of the images, scaled down along width x height, probably by ratios of 2, 4, 8
  - Select a small subset of camera positions to start with and gradually increase it with iterations
  - Allocate the initial voxel size based on all the camera positions and the inter-pixel ray distance; then, as the image dimensions increase, the voxel size would decrease accordingly
- Culling voxels - This reduces the number of voxels used for representing the object, which helps reduce memory use. Voxels in the bounding box which are not touched by the rays are culled. Two major regions:
  - Voxels inside the object which are not reached by the rays
  - Voxels outside the object region which are not touched by the training rays
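The culling idea above can be sketched as follows: march along every training ray, record which voxels of the bounding box get touched, and drop the rest. This is a simplified sketch that assumes the bounding box is the unit cube and uses fixed-step marching; it is not how NSVF implements its pruning, and the function names are hypothetical.

```python
import itertools


def touched_voxels(rays, resolution, step=0.05, t_max=4.0):
    """Collect voxel indices hit by fixed-step samples along each ray.

    rays: iterable of (origin, direction) pairs in unit-cube coordinates.
    The step should be smaller than the voxel size (1 / resolution),
    otherwise thin voxels can be skipped.
    """
    touched = set()
    n_steps = int(t_max / step)
    for origin, direction in rays:
        for k in range(n_steps + 1):
            p = [o + k * step * d for o, d in zip(origin, direction)]
            if all(0.0 <= c < 1.0 for c in p):
                touched.add(tuple(int(c * resolution) for c in p))
    return touched


def culled_voxels(rays, resolution, step=0.05, t_max=4.0):
    """Voxels in the bounding box that no training ray passes through."""
    all_voxels = set(itertools.product(range(resolution), repeat=3))
    return all_voxels - touched_voxels(rays, resolution, step, t_max)
```

Note that this only finds voxels no ray passes through at all. Voxels inside the object that rays would reach geometrically but are blocked by the surface need the learnt density or SDF as well.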
NeRF in Motion: Encoding motion for objects in a neural scene. There are different ideas for it.
- Datasets: D-NeRF: Neural Radiance Fields for Dynamic Scenes - Extends the dataset of NERF to dynamics, RigNET, People Snapshot Dataset - Video Based Reconstruction of 3D People Models, BUFF: Bodies Under Flowing Fashion - Detailed, accurate, human shape estimation from clothed {3D} scan sequences, PiFuHD - renderpeople, HDRI Haven, cape, code, utils
- Using a normalized coordinate system, i.e. map actual values to normalised values, then learn a warping function which adds on to them, and then render: f(X(x), Y(y), Z(z)). Maps (x,y,z) for each time step. Learn how bones are mapped to mesh pixels. Find the transformation function - Motion Capture based rendering system
- NSVF uses hypernetworks to encode a network for each time step. SRN
- Like mesh objects, can Bones, rigging and weighting be added for the objects thereby making it configurable Bone structure, Blog,
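For the bones, rigging and weighting question above, the standard mesh-side technique is linear blend skinning: each vertex is moved by a weighted sum of the bone transforms. A minimal sketch follows; the data layout (bones as rotation/translation pairs, one weight per bone) is an assumption for illustration.

```python
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def transform(rotation, translation, v):
    """Apply a rigid transform R @ v + t to a 3D point."""
    return tuple(sum(rotation[r][c] * v[c] for c in range(3)) + translation[r]
                 for r in range(3))


def linear_blend_skinning(vertex, bones, weights):
    """v' = sum_i w_i * (R_i v + t_i), with weights summing to 1.

    bones: list of (rotation, translation) pairs, one per bone.
    weights: skinning weights of this vertex, one per bone.
    """
    blended = [0.0, 0.0, 0.0]
    for (rotation, translation), w in zip(bones, weights):
        moved = transform(rotation, translation, vertex)
        for axis in range(3):
            blended[axis] += w * moved[axis]
    return tuple(blended)
```

For a neural field the same weights could in principle be predicted per query point instead of stored per vertex; that part is speculation and not shown here.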
Embedding or Latent variable to control different aspects of a 3D generation
- Shape is encoded as a latent vector and then a shape with the (xyz) is used for predicting shape - ShaRF: Shape-conditioned Radiance Fields from a Single View,
- Conditioned on the image, generate an embedding for each voxel which could be used as input along with xyz - pixelNeRF: Neural Radiance Fields from One or Few Images - Code
Relighting of models:
- X-Fields: Implicit Neural View-, Light- and Time-Image Interpolation, Deep Relightable Textures, Deferred Neural Rendering: Image Synthesis using Neural Textures
Few Shot Learning:
- Learning priors using Bayesian neural networks: Uncertainty Quantification, Weight Uncertainty in NN, Variational Inference in Bayesian NN - The idea is to train priors of shapes using neural networks and then try to learn the representation using a single image input. The prior stores the range of uncertainty in the variance of the weights and tries to find the right instance value using a single image input. We try to replace the NN in NeRF with a Bayesian NN and model the uncertainty in shape as the weights of the Bayesian neural network
- Understanding GPT3: Paper, GPT2 Blog, GPT2 Paper, Youtube, Youtube2,
- Transformers for 3d model - Perspective Transformer Nets, Set Transformer, Spatial Transformer for 3D Point Clouds
- Possible continuous learning
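To illustrate the Bayesian NN idea in the few-shot list above, here is a toy sketch of a single linear unit whose weights are Gaussian distributions rather than point values. Sampling the weights repeatedly gives a predictive mean and a spread, which is the quantity the notes propose to use as shape uncertainty. The independent Gaussian weights and the function names are assumptions for illustration; a real Bayesian NeRF would need variational training on top of this.

```python
import math
import random


def sample_weights(means, stds, rng):
    """Draw one concrete weight vector from the per-weight Gaussians."""
    return [rng.gauss(m, s) for m, s in zip(means, stds)]


def predictive(x, means, stds, samples=500, seed=0):
    """Monte Carlo predictive mean and standard deviation of w . x."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(samples):
        w = sample_weights(means, stds, rng)
        outputs.append(sum(wi * xi for wi, xi in zip(w, x)))
    mean = sum(outputs) / len(outputs)
    variance = sum((o - mean) ** 2 for o in outputs) / len(outputs)
    return mean, math.sqrt(variance)
```

Setting all standard deviations to zero recovers an ordinary deterministic network, so a large predictive spread marks inputs where the weights are still poorly constrained.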
Active learning for 3d object reconstruction
Structure from motion using deep learning: Turntable
- Can NeRF be modeled to run without the camera parameters? Since we are modelling the neural network as a function of x,y,z, can we learn the model using SGD without the camera parameters?
- Fix one camera position. Model the 3D model as a function of the relative camera parameters. Try minimising the error of the images while we learn the error from different projections.
Splitting light, view and time and directly rendering 2D images - X-field - paper, code
Compress 3D shape representation by using entries from a codebook like the idea in VQVAE - https://arxiv.org/pdf/1711.00937.pdf,
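The codebook idea above boils down to a nearest-neighbour lookup: each continuous embedding is replaced by the index of its closest codebook entry, so only the indices and the shared codebook need to be stored. A minimal sketch of that lookup step is below; training the codebook, as VQ-VAE does, is not shown, and the function names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector, codebook):
    """Return (index, entry) of the codebook entry closest to vector."""
    index = min(range(len(codebook)),
                key=lambda i: squared_distance(vector, codebook[i]))
    return index, codebook[index]


def encode(vectors, codebook):
    """Compress a list of embeddings to a list of codebook indices."""
    return [quantize(v, codebook)[0] for v in vectors]
```

For per-voxel embeddings this would store one small integer per voxel instead of a full feature vector.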
Depth/Height map as implicit functions :
- Using a base template and encoding the shape as a structure over the template - Learning Shape Templates with Structured Implicit Functions, code
- Human models: Converting 3d model to 2d surface plane - developable surface SMPL
  - SMPL to UV mapping, UV map to SMPL model, How to get UV coordinates for the template
- Fitting code
- Understanding SMPL, SMPLX
- Transferring between SMPL-SMPLX-SMPLH-FLAME-MANO
- SMPL - Papers, project - SMPLX - paper, supp, project
- Face Fitting - Flame fitting, FLAME
- Approximating 3d shapes with blobs/ellipsoids - project, Local Deep Implicit Functions for 3D Shape, Learning Shape Templates with Structured Implicit Functions
Online 3d models - Dataset - Turbosquid, sketchfab, Pixologic, Zbrush
Synching two cameras - libsoftwaresync - Code list:
https://github.com/shunsukesaito/PIFu
https://github.com/facebookresearch/pifuhd
https://github.com/kwea123/nerf_pl
https://github.com/facebookresearch/NSVF

Nerf Universe:

Basic Nerf
Ability to choose appropriate texture and material properties based on BRDF

Prominent Papers

Arxiv - Computer Vision
Nerf - citations
NSVF - citations
NeRF Explosion 2020
Awesome NeRF
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis: paper, code, mesh reconstruction, color reproduction, PyMcubes
Neural Sparse Voxel Fields: paper
Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains: paper
Generating Diverse High-Fidelity Images with VQ-VAE-2: paper
NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections: paper
Neural Rendering: project
Nerf++ - paper
iNeRF, NeRF−−
FroDo
PIFuHD
PIFu
GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields
- GRAF - project - code
Optimizing the Latent Space of Generative Networks
SFM - Structure from motion
- Pixelwise View Selection for Unstructured Multi-View Stereo
- COLMAP Documentation - Dense Reconstruction - multi-view - FAQ
- Structure From Motion - RANSAC, RANSAC1, five-point relative pose problem
- tutorial
- undistort images - making lines straight - usually lens causes distortion - camera calibration
- patchmatch stereo algorithm - Stereo Matching, wiki, code,
- stereo_fusion algorithms - slide
- poisson mesh reconstruction - slide
Computer Vision Course 2019 - 13 Alignment, 14 Calibration, 15 Single View, 16 Epipolar, 17 SFM, 18 Stereo, 19 Multiview stereo
DeepSDF - Code - Supp
Nerf-W - Nerf in the wild
Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations
State of the Art on Neural Rendering
Local Deep Implicit Functions for 3D Shape
Deformed Implicit Field: Modeling 3D Shapes with Learned Dense Correspondence
Stable View Synthesis
- Free View Synthesis - code
SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images
MetaSDF: Meta-learning Signed Distance Functions
SDFDiff: Differentiable Rendering of Signed Distance Fields for 3D Shape Optimization - code
- Implicit function
Vladlen Koltun: Towards Photorealism, contact
Understanding illumination: Differentiable renderer idea, Two-shot Spatially-varying BRDF and Shape Estimation code
- Deep Learning Papers: NeRD: Neural Reflectance Decomposition from Image Collections - project, NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis - project
- Tutorial on Spherical Gaussians - SG SERIES PART 1-6, Spherical Harmonic Lighting: The Gritty Details, Cook–Torrance Model, Environment mapping Slides
- A Reflectance Model for Computer Graphics, Bidirectional reflectance distribution function - BRDF - Phong Reflection model - Cook–Torrance model, Physically-Based Shading at Disney, Disney BRDF Base color Metallic parametrization, An Efficient Representation for Irradiance Environment Maps, On the Relationship between Radiance and Irradiance: Determining the illumination from images of a convex Lambertian object, Physically based rendering (PBR), OpenGL PBR,
- Material Editing - Photorealistic Material Editing Through Direct Image Manipulation, Gaussian Material Synthesis
- Not Read, Probably good:
  - Neural Reflectance Fields for Appearance Acquisition - author
  - All-Frequency Rendering of Dynamic, Spatially-Varying Reflectance, tutorial slide - tutorials on using Spherical Gaussians for BRDF and Environment illumination - Portrait Neural Radiance Fields from a Single Image
- Physics based Methods in Vision
- Lighting basics, Video, 3D Texturing
Online 3D model assets:
- TurboSquid - Using professional models. This is a good reference
- Environmental Maps - HDRI Haven
Depth Estimation from input image: MiDaS, code,latest code - Perpetual View Generation - Infinite Nature
- One Shot 3D Photography, code,
Local Deep Implicit Functions for 3D Shape - code, paper - Learning a transformation matrix to approximate a gaussian function to define an implicit function. Learning Shape Templates with Structured Implicit Functions - Can we use this to approximate shapes
Understanding SDF based operations - ray marching and sdf - article1, article2, SDF rendering, Sphere Tracing
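As a quick reminder of how sphere tracing works: at each step the SDF value is the largest distance that is guaranteed to be free of geometry, so the ray can safely advance by exactly that amount. A minimal sketch, assuming a unit-length ray direction; the parameter names and defaults are illustrative only.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere, used here as the test scene."""
    return math.dist(point, center) - radius


def sphere_trace(sdf, origin, direction, max_steps=128, hit_eps=1e-5,
                 max_dist=100.0):
    """Return the ray parameter t of the first hit, or None on a miss.

    direction must be unit length so that t is a true distance.
    """
    t = 0.0
    for _ in range(max_steps):
        point = tuple(o + t * d for o, d in zip(origin, direction))
        distance = sdf(point)
        if distance < hit_eps:
            return t  # close enough to the surface
        t += distance  # safe step: nothing is nearer than this
        if t > max_dist:
            return None
    return None
```

The surface point is then `origin + t * direction`, and the normal can be estimated from finite differences of the SDF at that point.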
canonical coordinates 3d model, quaternion representation of rotation
Adding secondary motion - Complementary Dynamics
Not read: Learning to Recover 3D Scene Shape from a Single Image, Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations, Object-Centric Neural Scene Rendering, computer assisted prog, signed-distance-SRN, differentiable_volumetric_rendering,
Quaternions - ref1, ref2, SE(3) transform
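For reference, a unit quaternion (w, x, y, z) converts to a rotation matrix as sketched below. The (w, x, y, z) ordering is an assumption; some libraries store the scalar part last.

```python
import math


def quaternion_to_matrix(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    norm = math.sqrt(sum(c * c for c in q))
    w, x, y, z = (c / norm for c in q)  # normalise to a unit quaternion
    return (
        (1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)),
        (2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)),
        (2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)),
    )


def rotate(q, v):
    """Rotate the 3D vector v by the quaternion q."""
    m = quaternion_to_matrix(q)
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))
```

For example, `(cos(a/2), 0, 0, sin(a/2))` is a rotation by angle `a` about the z axis.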
StyleGAN, StyleFlow
NeRF−−: Neural Radiance Fields Without Known Camera Parameters
Companies - Teriflix, Searching for artist, paperboatstudios, studiodurga

AI Sculptor: Treating each ray as a nail, we would like to sculpt a 3D model based on multiple input views. We use NSVF as our base code. The list of problems we are planning to attack:
- Marching cubes would fail to recover the surface since the function learns only the boundary and does not learn the inside of the object. So a different learning algorithm is required for learning the 3D mesh object. We need to find a better algorithm to extract the 3D mesh and surface information.
  - A solution would be to shoot rays from different directions and choose only those points which have normals parallel to the intersecting ray. Use these points as input to create a point cloud and then convert this point cloud to a mesh using pointcloud2mesh algorithms, Surface Reconstruction
- Alternative representation: A signed distance field represents objects better than a transparency representation. Can we use SDF instead of transparency? Define a rendering function using SDF. Use this function instead of the transparency based rendering function and then use it to represent the neural scene. DeepSDF, DeepLS, Papier-Mâché, Occupancy Networks
- Hierarchical representation: NeRF/DeepSDF use a NN to represent the whole scene. NSVF/DeepLS use local embedding information to represent shapes in voxels. Can a hierarchical representation of latent variables be used to represent the whole shape, to get a consistent representation?
- Can the shape information used as the code in DeepSDF be used in a NeRF/NSVF kind of setup to reduce the number of images needed to encode a scene, ideally directly from a single image?
- Physics simulation using NeRF: Real Time Fluid simulation, Learning to simulate
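One possible shape for the "rendering function using SDF" asked for above: map the signed distance to a density with a logistic function (high inside the object, near zero outside) and then reuse the usual NeRF-style alpha compositing. The logistic mapping, the `beta` sharpness parameter and the function names are assumptions chosen for illustration; other mappings are equally plausible.

```python
import math


def sdf_to_density(sdf_value, beta=0.05):
    """Assumed mapping: density = sigmoid(-sdf / beta) / beta."""
    x = sdf_value / beta
    if x >= 0.0:
        e = math.exp(-x)  # stable branch for points outside the object
        s = e / (1.0 + e)
    else:
        s = 1.0 / (1.0 + math.exp(x))  # stable branch for points inside
    return s / beta


def render_depth(sdf, origin, direction, t_far=6.0, step=0.02, beta=0.05):
    """Alpha-composite along a ray; returns (weighted depth, opacity)."""
    transmittance, depth, opacity = 1.0, 0.0, 0.0
    for k in range(int(t_far / step)):
        t = (k + 0.5) * step  # midpoint of the k-th ray segment
        point = tuple(o + t * d for o, d in zip(origin, direction))
        alpha = 1.0 - math.exp(-sdf_to_density(sdf(point), beta) * step)
        weight = transmittance * alpha
        depth += weight * t
        opacity += weight
        transmittance *= 1.0 - alpha
    return depth, opacity
```

A smaller `beta` concentrates the weights closer to the zero level set, so the rendered depth approaches the true surface hit, at the cost of sharper gradients during training.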

List of questions

multiresolution in signed distance function, how are multiple objects composed using SDF, What is the maximum capacity of NN?, occupancy vs sdf comparison
a KD-tree - nearest neighbor algorithm -
distance transform for any watertight shapes
Weight normalization: A simple reparameterization to accelerate training of deep neural networks. - speed up training instead of batch normalization
H.P.: Multi-level partition of unity implicits. - Splitting a volume into smaller regions and trying to generate the whole shape based on the sum of the smaller regions - weighting function is 1 when you are within the region - https://www.cc.gatech.edu/~turk/my_papers/mpu_implicits.pdf - https://en.wikipedia.org/wiki/Partition_of_unity
datasets: https://3dwarehouse.sketchup.com/, http://graphics.stanford.edu/data/3Dscanrep/
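A 1D sketch of the partition of unity blending from the MPU note above: every region has a local function and a weight that fades to zero at the region boundary, and the weights are normalised so that they sum to 1 wherever regions overlap. The hat-shaped weight and the patch layout are assumptions for illustration; the MPU paper uses its own weight functions and an octree of regions.

```python
def hat_weight(x, center, radius):
    """Weight that is 1 at the region centre and 0 at its boundary."""
    return max(0.0, 1.0 - abs(x - center) / radius)


def blend(x, patches):
    """Blend local functions with normalised weights.

    patches: list of (center, radius, local_function) tuples.
    """
    weights = [hat_weight(x, center, radius) for center, radius, _ in patches]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("x is not covered by any region")
    return sum(w / total * fn(x)
               for w, (_, _, fn) in zip(weights, patches))
```

Because the normalised weights sum to 1, the blend reproduces any function that all overlapping local fits agree on, which is what keeps the global shape consistent across region borders.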

Other queries

AI Economist
DALL·E: Creating Images from Text
Transformers are Graph Neural Networks
Immersive Light Field Video with a Layered Mesh Representation
EFFICIENT CONTINUAL LEARNING WITH MODULAR NETWORKS AND TASK-DRIVEN PRIORS
IIRC: Incremental Implicitly-Refined Classification
An algorithm to solve generic problems: C-Space Tunnel Discovery for Puzzle Path Planning, More Puzzles, paper - Concentrate on the different planning algorithms. The basic idea would be to find tunnels (i.e. narrow points which could possibly lead to a solution). Then have a double tree, one starting from the start and one growing in reverse from the goal. This could be done for every tunnel point. This stage is the blooming stage. You could approximate the value of the tree using neural networks. Try finding links between the bloomed tree and the successive tunnel paths. If we find a route, find the optimal route from start to end - Probabilistic Roadmap Path Planning PRM Planner, PRM1, RDT-Based Methods - dual-tree RDT algorithm
Interlinked SPH Pressure Solvers for Strong Fluid-Rigid Coupling,

Significant Areas

Bayesian Optimization based GAN latent variable search
Bayesian Neural network based BO
StyleGAN
Camera pose estimate
Fourier feature networks
Photogrammetry - software
Part segmentation
HairNet
Sculpting
SMPL
Image Harmonization
Pose Transfer
Multi-view rendering
HyperNetworks
Continuous Learning - Iman, paper, Review Paper, Tackling catastrophic forgetting, Incremental learning
3D Machine Learning Github

To Be Read Papers

Current Task List

Tutorials

GAN basics,
Running models on GPU
Framework - Pytorch, Keras, numpy, Pytorch3D
AlphaZero,
CPISADGAN
BO
MultiView Geometry, Camera Basics, Rotation Matrix, Projective, Affine, Euclidean Geometry, NDC Co-ordinate Systems

Children resources

Videos

List of topics to be covered

Good resources

Sutton Book: Book Draft, Code And Examples
Deep Mind: RL Course by David Silver - First of a 10 lecture series - http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html

Accelerators

Job Options

Internships

Studio list - Pixar, Nickelodeon, Laika, DreamWorks, Blue Sky, Warner Bros., Cartoon Network, Blizzard
How to get a pixar internship
Animation colleges - conceptdesignacad
How to get a job at Blizzard

Portfolio creation

DeviantArt, Instagram, YouTube
List of inspiring artists: bobby pontillas, rafael grassetti human anatomy, youtube, Laura Price MY ART JOURNEY, Michael Vicente, Ross draws
Polycount, zbrush central, artstation
Books - Composing Pictures, illusion of life, Animators survival kit

Channels followed

GPU and CUDA resource

AWS:
- p2 - NVIDIA Tesla K80 - 2 x 2496
- G3 - NVIDIA Tesla M60 - 4096 NVIDIA CUDA® cores (2048 per GPU)
- G2 - NVIDIA GRID K520 GPUs - 3072 core
- CG1 - NVIDIA Tesla M2050
NVIDIA Grant
- NVIDIA Quadro M5000 - 2048 [GeForce Titan Xp, Quadro P5000, NVIDIA M5000 specification]
- GTX 1080 Ti - 3584 - 700$
- GTX 1080 - 2,560 - 500$
- GTX 1070 - 1920 - 390
- Titan X pascal - 3584
- Jetson
EGPU: egpu, 9to5mac
