
Five to Watch: Key Growth Areas for Computer Vision and Pattern Recognition

computer.org


Table of Contents

Executive Summary

Security & Biometrics

Image & Video Synthesis

3D Computer Vision

Representation Learning

Improving Model Efficiency

Conclusion


Executive Summary

In the scientific and research communities, any discussion of an advance in computer vision or pattern recognition tends to lead back to the Computer Vision and Pattern Recognition (CVPR) conference. With increased attention to and demand for artificial intelligence (AI) and machine learning (ML) technologies, the conference has grown in prestige from one that served a specific research community to one that is internationally renowned for the technical developments it facilitates in these areas.

In fact, according to 2021 conference program chair David A. Forsyth, University of Illinois Urbana-Champaign, “[CVPR] is indubitably the most influential conference in science right now.” And Guide2Research placed CVPR at the top of all conferences in computer science in terms of its influence.1

That explains why CVPR 2021 received more than 7,000 submissions, from which 1,600 were selected to be presented in 12 sessions. In addition, 140 papers were considered for awards, based on recommendations from area chairs about the strength of the work.

With that level of influence on research, CVPR trends have become indicators of hot topics for the broader AI and ML community, and this year’s conference gave rise to new emphasis on five important subject areas.

1. Security & Biometrics
A tangible example of AI and ML in the wild, security and biometric applications continue to climb as the technology becomes more mainstream and is applied to issues as diverse as deepfakes and border accessibility.

2. Image & Video Synthesis
Industry heavyweight companies have a vested interest in improvements in these areas, and as such, research skews in that direction in support of both novel approaches and applicable concepts.

3. 3D Computer Vision
The market for 3D was valued at USD 1.13 billion in 2019 and is expected to grow at a compound annual growth rate (CAGR) of 14.7% from 2020 to 2027.2 This industry-based emphasis suggests that research supporting 3D Computer Vision will continue to be a hotbed of activity; in fact, this topic represented the largest number of submissions for CVPR 2021.

4. Representation Learning
In creating new approaches to more effectively and efficiently extract features from unlabeled data, the scientific community refines and supports more effective AI and ML systems.

5. Improving Model Efficiency
While researchers continue to make models more efficient and thus more accessible, companies struggle with internal methodologies to support AI/ML systems. Thus, this topic will be a focus of both the research and applied communities for the foreseeable future.

While the conference extensively covers the wide scope of topics affecting the computer vision community, these five rose to the surface as the ones to watch now. From increased submissions to broad accolades, the papers and sessions that addressed these topics offered cutting-edge research and developments on which to build future research. After all, creating opportunities for the future is, by definition, the chief indicator of a topic to watch.



A Growing Demand

$327.5 billion: Forecast worldwide revenues for the artificial intelligence (AI) market this year.3

By 2024, the AI market is expected to break $500 billion.

52% | 35%

of companies who are in growth mode will look to AI to help with workplace disruption.8

of companies plan to implement AI-enabled conversational systems by 2025.7

54% of companies plan to implement AI approaches to process automation within the next few years.6

86% of CEOs say that AI will be a “mainstream technology” for them in 2021.5


1. Security & Biometrics

As computer vision and pattern recognition move from research to application, security and biometrics play a significant role, prompting both industry implementation and deeper academic analysis. In fact, CVPR 2021 received more papers on this topic than anticipated, leading to a deep exploration of challenges and opportunities in this space.

First, consider the security of the ML configurations themselves. Adversarial ML has risen in prominence as the use of computer vision techniques for facial recognition and other biometric security mechanisms has become more accessible and acceptable as a form of identification.

By studying adversarial ML, leaders can address some of these vulnerabilities and identify ways to safeguard the systems. At the workshop Adversarial Machine Learning in Real-World Computer Vision Systems and Online Challenges, experts examined how to thwart such vulnerabilities. The paper “On the Benefits of Defining Vicinal Distributions in Latent Space,” which was named best paper among the workshop entries, proposed a new vicinal distribution/sampling technique called VarMixup (Variational Mixup) to “sample better Mixup images during training to induce robustness as well as improve predictive uncertainty of models.” The group from IIT Hyderabad, India, reported positive results, concluding that “VarMixup trained models are more robust to common input corruptions and are better calibrated,” a step toward shoring up exposures.10
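For readers unfamiliar with the Mixup family of techniques that VarMixup builds on, the following is a minimal sketch of standard input-space Mixup only. It is not the VarMixup method itself, which, as the paper's title indicates, defines its vicinal distribution in a latent space rather than directly on the inputs. The function name, the `alpha` value, and the toy inputs are illustrative assumptions.

```python
import random

def mixup(x_a, y_a, x_b, y_b, alpha=0.4, rng=random):
    """Blend two examples and their one-hot labels (standard input-space Mixup).

    alpha is an assumed hyperparameter; the mixing weight is drawn from
    Beta(alpha, alpha), so small alpha keeps most samples close to one input.
    """
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam

# Toy usage: two 2-pixel "images" with one-hot labels for two classes.
x, y, lam = mixup([0.0, 1.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                  rng=random.Random(0))
print(f"lambda={lam:.3f}, mixed label={y}")
```

The mixed label stays a valid probability vector because the same weight is applied to inputs and labels, which is what lets a classifier train on the blended examples.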

On the other side of the coin are the advances in the biometric approaches themselves. The CVPR 2021 Biometrics Workshop featured a keynote on this topic that explored the implications for border control surveillance. Christoph Busch of the Department of Information Security and Communication Technology (IIK) at the Norwegian University of Science and Technology (NTNU) spoke about the issues that arise when electronic Machine Readable Travel Documents (eMRTDs) are used as a primary form of biometrics for border applications. Because many countries issue these eMRTDs based on a photo submitted by the applicant, the source data may be manipulated or morphed to conceal identity.



“If we’re going to use computer vision for security purposes, it really has to be resilient and accurate and have high integrity,” said Diana Kelley, co-founder and CTO of Security Curve, during IEEE CS’ Computer Vision Meets Security Panel at CVPR 2021. “But no matter what we’re using computer vision for, if we don’t secure it, it’s going to be vulnerable to failure modes and attacks.”9


The discussion explored ways in which to detect these morphed facial images, including using an Identity Prior Driven Generative Adversarial Network. Busch issued a call to the community to continue working on ways to better identify and prevent these face morphing attacks.11

Other sessions explored ways to deepen facial recognition techniques, including face alignment and pose augmentation. For example, the award-winning paper from the IEEE CS’ Workshop on Analysis and Modeling of Faces and Gestures, “EVA-GCN: Head Pose Estimation Based on Graph Convolutional Networks,” introduced the concept of “Graph Convolutional Networks (GCN) to model the complex nonlinear mappings between the graph typologies and the head pose angles.”12 Head pose estimation is growing in significance through its use in applications such as virtual reality and driving assistance, and current landmark-based prediction methods have drawbacks, so finding more accurate approaches has become important. Using a scene from the movie Roman Holiday, the researchers from the Institute of Automation at the Chinese Academy of Sciences, Carnegie Mellon University, and Beihang University accurately detected head motion and achieved better performance “in comparison with the state-of-the-art landmark-based and landmark-free methods.”

The facial recognition market will grow from an estimated 3.8 billion U.S. dollars in 2020 to 8.5 billion U.S. dollars by 2025.13



The Ethics Debate

Ethics has become a critical component of any conversation about AI and ML and their applications. Data privacy and consumer protections have become major topics. The ethical treatment of consumers’ digital footprints, as well as their interpretation and the ruling out of bias, has been a major focal point for the computer vision community.

During the Responsible Computer Vision Workshop at CVPR 2021, researchers presented the latest developments in meeting these ethical expectations. Miranda Bogen of Facebook may have summed up the sentiment best:

“I urge you to take a holistic approach when it comes to responsible computer vision. Don’t just focus on the implementation. Don’t just focus on the product goals and whether the context is appropriate. Think about all of them, think about how they all work together and how they interact. If this field doesn’t take these multidimensional and multidisciplinary issues into account early and consider the policy and social and potentially legal context, we’ll continue to see computer vision applications deployed that unfairly and potentially harmfully impact populations.” 14

To that point, the work done by the industry to date continues to need support and guidance. Analyst firm Cognilytica evaluated over 60 existing ethical AI frameworks produced by governmental organizations, corporations, multinational groups, non-profits, and other groups. Their findings did not leave glowing reviews of the current landscape: Government frameworks were most lacking in responsible AI use; corporate frameworks were missing elements of system transparency; and multinational organizations seemed most concerned with regulation, certification, and third-party oversight.15

“Some surprising insights that we found were that a lot of multinational and corporate efforts beat out country efforts,” says Kathleen Walch, managing partner, principal analyst, Cognilytica. “Countries aren’t really paying enough attention to responsible AI uses… Best-in-class AI methodologies, such as the Cognitive Project Management for AI (CPMAI) methodology, which is advocated by Cognilytica, require the establishment of firm ethical AI guidelines in its very first phase… and we’ve found that surprisingly more people than you’d think are not actually taking a very good look at this.”

With so much at stake, ethics will continue to be a significant focus for the computer vision community and a topic of future conversations at CVPR. For more information on this year’s workshop, visit:

https://sites.google.com/view/rcv-cvpr2021.


2. Image & Video Synthesis

As another topic to watch, image and video synthesis has emerged as a key area for advancing both the efficiency and applicability of ML and AI programs.

For one, in the recently published CVPR 2021 best paper award winner, “GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields,” the authors studied how they may introduce more control into photorealistic image synthesis at high resolutions.16 Their work focused on disentangling individual objects and allowed for translating and rotating them in the scene as well as changing the camera pose.

“The question of how to obtain full controllability over the generation process remains unclear,” shared author Michael Niemeyer of the Max Planck Institute for Intelligent Systems, in Tübingen, Germany. “The key idea of this work is to incorporate a compositional 3D scene representation into the generator model.”

The paper demonstrated a level of control over the scene that allowed for manipulations including rotating the object in the scene, translating the object horizontally, changing the depth of the object, shifting the object’s shape, transforming the object’s appearance, and changing the background. If the authors incorporate supervision as a next step, this work may serve as the basis for more complex, multi-object scenes.

Google Research, too, has spoken of a desire for control of a scene. In the presentation, “NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis,” researchers shared a method that “takes as input a set of images and produces as output a 3D representation that can be rendered from novel viewpoints under arbitrary lighting conditions.”

“NeRV only requires a set of images with known lighting conditions and recovers a 3D representation that can be rendered from novel viewpoints under arbitrary lighting conditions,” summarized researchers during their conference session.17

Beyond the advances in such research, commercial applications continue to pepper the market. For example, in the workshop AI for Content Creation, presentations investigated advances in image and video synthesis, with a focus both on techniques and applications of content creation.18

Virtual reality, videography, gaming, and retail and advertising all benefit from the progress of deep learning and machine learning techniques that automate what once were arduous, time-consuming tasks. These developments have allowed for a host of new possibilities at the intersection of the graphics, computer vision, and design communities.



In a sponsored session, Microsoft researchers evaluated using synthetic data for human-related computer vision problems, sharing that the fully articulated hand tracking on HoloLens 2, the company’s mixed reality device, was developed using synthetic data alone. The team pointed to three primary reasons to use synthetic data in its development: (1) clean labels without annotation, noise, or error; (2) the ability to generate labels impossible to annotate by hand; and (3) easy control of variation in the dataset. In addition, they were able to demonstrate the granularity of the detail achieved, down to strands of hair in facial imaging.

“Sometimes [synthetic data] is not just as good as real data, but often it is better,” said Tadas Baltrusaitis, senior scientist, Microsoft Research U.K. “Whilst we’ve focused on faces and hands, we still have this dream of creating a human synthetics pipeline that would allow us to render everything together at really high fidelity in one unified system.”19

Analysts expect the focus on synthetic data to continue to grow. According to Forrester Research’s 2021 predictions, “There will be more progress toward trusted data for AI. 2021 will showcase the good, the bad, and the ugly of artificial data,” including synthetic data. The report also hypothesizes that “blockchain and AI will start joining forces more seriously to support data provenance, integrity, and usage tracking.”20


CVPR 2021 Best Paper

“GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields,” for controllable content creation in a model that “disentangles individual objects and allows for translating and rotating them in the scene as well as changing the camera pose.”


3. 3D Computer Vision

3D computer vision has grown in research interest and exploration this year, and this topic, which ranges from purely geometric methods to 3D point clouds to 3D from a single image and beyond, had the largest number of submissions at CVPR 2021. The growth may, in part, stem from the pertinence of the research to a wide variety of applications.

CVPR 2021 best paper finalist “Diffusion Probabilistic Models for 3D Point Cloud Generation” presented a probabilistic model for point cloud generation.21 The team, made up of researchers from Peking University in Beijing, China, emphasized that advances in depth sensing and laser scanning have made point clouds a popular representation for modeling 3D shapes. In their work, they proposed “modeling the reverse diffusion process for point clouds as a Markov chain conditioned on certain shape latent.” Their results achieved competitive performance in point cloud generation and auto-encoding.

“Our model is inspired by the diffusion process in nonequilibrium thermodynamics,” said presenter Wei Hu. “We consider the reverse of the diffusion process. The reverse process does the reverse thing. It starts from a chaotic set of points and recovers them to a specific shape. The reverse model needs to learn from data and from how to recover the point set.”22
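The forward half of such a process, the gradual noising that the learned reverse model has to undo, can be sketched in a few lines. This is a generic Gaussian noising chain assumed for illustration, not the authors' implementation; the constant noise schedule, the step count, the circle-shaped input, and the function name are all hypothetical choices.

```python
import math
import random

def forward_diffuse(points, betas, rng):
    """Noise a point cloud step by step.

    Each step applies x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * eps,
    with eps drawn from a standard normal, independently per coordinate.
    """
    cloud = [tuple(p) for p in points]
    for beta in betas:
        keep, noise = math.sqrt(1.0 - beta), math.sqrt(beta)
        cloud = [tuple(keep * c + noise * rng.gauss(0.0, 1.0) for c in p)
                 for p in cloud]
    return cloud

# Hypothetical input: 200 points on a unit circle lying in the z = 0 plane.
rng = random.Random(0)
shape = [(math.cos(2 * math.pi * i / 200), math.sin(2 * math.pi * i / 200), 0.0)
         for i in range(200)]
betas = [0.02] * 200  # assumed constant noise schedule
noisy = forward_diffuse(shape, betas, rng)

# The flat circle has spread into a roughly isotropic blob of points.
z_spread = sum(p[2] ** 2 for p in noisy) / len(noisy)
print(f"mean squared z after diffusion: {z_spread:.2f}")
```

A generative model of this kind is trained to run the chain in the opposite direction, starting from the noisy blob and recovering a shape, which is the "reverse process" described in the quote.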

Another best paper finalist, “Point2Skeleton: Learning Skeletal Representations from Point Clouds,” introduced a novel unsupervised method for generating skeletal meshes from point clouds by “using the insights of the medial axis transform (MAT) to capture the intrinsic geometric and topological natures of the original input points.”23

The results of the researchers’ work are promising, showing “the skeletal mesh generated by [their] method effectively captures the underlying structures for general 3D shapes, even when represented as point clouds with noises or missing parts.” The research team, from The University of Hong Kong in China; University College London in the U.K.; and Texas A&M University in College Station, Texas, U.S.A., sees promise for this work in examining how to combine the geometric and topological properties of the skeletal mesh with higher-level supervision from humans (e.g., semantics) for 3D learning tasks.


In addition to these noted papers, a session from Amazon Consumer Science gave insights into how the team is working to bring 3D computer vision to the Amazon customer.24 In his presentation, Frederic Devernay, senior applied scientist, imaging technology, shared that their goal is to build 3D models for all products in the Amazon catalog. A second presentation from Michael Liu, senior applied science manager, visual search, discussed visual search on Amazon, the camera inside the app that helps a customer take a picture of a product and find it on Amazon—and the next iteration of that specific approach.

“We believe the future of visual search is to be a complementary text search,” said Liu. “So, we’ve been working with the Amazon search team to see how we can directly turn an image into a search query and fuse behavioral data into the search results.”


The global 3D machine vision market size was valued at USD 1.13 billion in 2019 and is expected to grow at a compound annual growth rate (CAGR) of 14.7% from 2020 to 2027.
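To make the compounding concrete, the short sketch below projects the market size implied by these two figures. It assumes growth compounds annually from the 2019 valuation through 2027 (eight periods); the cited forecast may use a different base year, so the result is an illustration of the arithmetic rather than a figure from the source.

```python
def project_market(value, cagr, years):
    """Compound a starting value forward by `years` at annual growth rate `cagr`."""
    return value * (1.0 + cagr) ** years

base_2019 = 1.13       # USD billions, as cited above
cagr = 0.147           # 14.7% per year, as cited above
years = 2027 - 2019    # assumption: compounding starts from the 2019 valuation

projected_2027 = project_market(base_2019, cagr, years)
print(f"Implied 2027 market size: USD {projected_2027:.2f} billion")
```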


Aleix Martinez, senior applied science manager, home innovation, described how Amazon Home has been employing a tremendous amount of research using transformers to support this work, including recent work to incorporate functionality from Liu’s area.

“This is a really hard problem for machine learning. We are trying to simulate what an expert interior designer, a human, does,” concluded Martinez. “We’re working with other groups to add this into StyleSnap [the camera feature], so you can actually take a picture of your own room and then we can give you suggestion of collections for that.”

In a different type of applied approach to 3D computer vision, the paper “Birds of a Feather: Capturing Avian Shape Models From Images” points to the difficulties in building a deformable shape model for a new species.25 The lack of 3D data makes this difficult, but in their work, PhD students and researchers from the University of Pennsylvania, Philadelphia, U.S.A., present a new method for capturing shape models of novel species from image sets. The paper concludes that by using a low-dimensional embedding, the “learned 3D shape space better reflects the phylogenetic relationships among birds than learned perceptual features.”26


4. Representation Learning

In addition to this emphasis on applied research, the community continues to see growth in fundamental science, with representation learning as one focal point of study. In fact, CVPR 2021 organizers noted that the conference hosted an uptick of presentations on representation learning, including award-recognized papers. When leveraging representation learning approaches, researchers seek to improve model performance by “extracting features from unlabeled data by training a neural network on a secondary learning task.”27 Through this approach, representation learning holds promise for expanding areas of study, including movement into unsupervised learning models.

Case in point, in “Exploring Simple Siamese Representation Learning,” a best paper honorable mention from CVPR 2021, researchers demonstrated that simple Siamese networks can learn meaningful representations. In fact, their work demonstrates that “a stop-gradient operation plays an essential role in preventing collapsing,” resulting in a method that achieves competitive results on ImageNet and downstream tasks.28


“We show that a simple Siamese network with a stop gradient and optionally a predictor multi-layer perceptron is sufficient for unsupervised representation learning,” said Xinlei Chen, Facebook AI Research. “Through experiments, we find the stop gradient operation is crucial. Without it, the representation collapses.” 29
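The shape of the objective described here can be sketched as a symmetric negative cosine similarity between one branch's predictor output and the other branch's stop-gradient embedding. The sketch below computes only the forward value in plain Python, so `stop_gradient` is a placeholder that marks where an autodiff framework would block gradients; the vector names (`p1`, `z1`, `p2`, `z2`) and toy inputs are assumptions for illustration.

```python
import math

def stop_gradient(v):
    # Placeholder: plain Python has no autodiff, so this is an identity copy.
    # In a framework this is where gradients would be blocked, for example
    # jax.lax.stop_gradient(v) in JAX or v.detach() in PyTorch.
    return list(v)

def neg_cosine(p, z):
    """Negative cosine similarity between a prediction p and a target z."""
    dot = sum(a * b for a, b in zip(p, z))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_z = math.sqrt(sum(b * b for b in z))
    return -dot / (norm_p * norm_z)

def siamese_loss(p1, z1, p2, z2):
    """Symmetric loss: each view's prediction chases the other view's embedding."""
    return (0.5 * neg_cosine(p1, stop_gradient(z2))
            + 0.5 * neg_cosine(p2, stop_gradient(z1)))

# Toy usage: predictions that point the same way as the opposite embeddings
# reach the minimum value of the loss.
print(siamese_loss([1.0, 0.0], [0.0, 2.0], [0.0, 3.0], [4.0, 0.0]))
```

Because the target side is treated as a constant, each branch is pulled toward the other without both being free to drift to the same trivial output, which is the collapse the quote refers to.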


Similarly, scientists from The Chinese University of Hong Kong noted positive developments with unsupervised representation learning models. In “Jigsaw Clustering for Unsupervised Visual Representation Learning,” the research team was able to introduce a method which negates the need for multiple batches during training and “opens the door for future research of single-batch unsupervised methods.”30 The group achieved strong results, outperforming existing models.

Then, in what was CVPR 2021’s best student paper, “Task Programming: Learning Data Efficient Behavior Representations,” researchers showed they could achieve performance similar to that of approaches built on specialized domain knowledge, with fewer hours and training annotations, making their approach more cost-effective and efficient.31 The team worked to unite the self-supervised learning framework with the weak supervision/programmatic supervision framework through task programming. The main idea was to develop programs that can “oversee” the work in place of human supervision, which is helpful for behavioral analysis experiments.

“Large amounts of behavioral data are recorded in many domains, from autonomous vehicles to sports analytics to biology and video games,” said Jennifer Sun, Caltech. “Using our framework, experts can trade off a large amount of annotations for a small number of tasks.”32

This effort yielded results: the Caltech model reduced annotation burden by up to a factor of 10 without compromising accuracy compared to state-of-the-art features.



CVPR 2021 Best Student Paper

“Task Programming: Learning Data Efficient Behavior Representations” for “TREBA: a method to learn annotation-sample efficient trajectory embedding for behavior analysis, based on multi-task self-supervised learning.”



Computer Vision for Health and Medicine

Healthcare continues to be one of the top vertical markets to benefit from AI and ML approaches. According to Grand View Research, the global artificial intelligence in healthcare market size was valued at USD 6.7 billion in 2020 and is expected to expand at a compound annual growth rate (CAGR) of 41.8% from 2021 to 2028.33

Part of this growth can be attributed to the impact of COVID-19. From virus detection and diagnosis to treatment protocols and preventative measures, AI played a critical role in analyzing massive datasets to pinpoint solutions. And experts anticipate that level of intensity to only escalate for future disease diagnosis and drug discovery.

This year’s IEEE CVPR 2021 Medical Computer Vision Workshop explored some of these concepts, focusing on developments in computer vision in the medical arena, including the talks described below, among others.34

These examples only scratch the surface of the work being pursued at the intersection of computer vision and medicine. For more information on the sessions, visit the workshop website at:

https://sites.google.com/view/cvprmcv21.

“MD + Machine: Reimagining Surgical Oncology in the Age of Learning Models”

Parvin Mousavi of Queen’s University, Canada, discussed the evolution of cancer surgery with data science influences. One strong use is in determining more accurate margins for the surgeon in the removal of a cancerous lesion or tumor.

“Values of AI-based Medical Images—from a Clinician Point of View”

Dorothy Tzu-Chen Yen of Chang Gung University, Taiwan, focused on how AI support of medical imaging can strengthen diagnosis, with one of the biggest hurdles being overcoming concerns about AI-based medical images.

“nnU-Net: Automated Design of Deep Learning Methods for Biomedical Image Segmentation”

Paul F. Jaeger from the German Cancer Research Center focused on the impact their nnU-Net approach is having on image segmentation, noting that at MICCAI 2020 (International Conference on Medical Image Computing and Computer Assisted Intervention), nine out of 10 segmentation challenge winners based their method on nnU-Net.


5. Improving Model Efficiency

Representation learning models lead into a different area of focus for the AI/ML research community: efficiency. In fact, CVPR 2021 saw an increase in papers on this topic. That may be because, as AI/ML technologies become more efficient, they become more accessible to institutions of all sizes and can then be leveraged more widely.

The CVPR paper, “Improving the Efficiency and Robustness of Deepfakes Detection Through Precise Geometric Features,” emphasized the security implications of increasing model efficiency. The work focused on how to better detect deepfakes to prevent copyright infringement, information confusion, or even public panic. In this paper, researchers from Shanghai Jiao Tong University in Shanghai, China, introduced a framework—which they named LRNet—for detecting deepfake videos through “temporal modeling on precise geometric features.” Their work found that “integration of facial landmarks and temporal features can be a fast and robust test of deepfakes… facial geometric information and its dynamic characteristics are worth exploring.”35

Then, the paper, “Group Whitening: Balancing Learning Efficiency and Representational Capacity,” assesses group whitening to “exploit the advantages of the whitening operation and avoid the disadvantages of normalization within mini-batches.” The work, conducted by a group of researchers from Beihang University, Beijing, China; Southeast University, China; and Inception Institute of Artificial Intelligence (IIAI), Abu Dhabi, UAE, concludes that group whitening “consistently improves the performance of different architectures.”36

Yet, while research to ensure more efficient, effective models makes AI and ML solutions more broadly accessible and reliable, the catch-22 comes in company readiness to employ these solutions. Market experts reiterate the importance of establishing foundational best practices before integrating more complex solutions into their models.

“Companies that have already adopted AI at their organization, they’ve already run a few projects, maybe they’re very well on the AI adoption journey, they’re always looking to improve model efficiency, but for a vast majority of companies, they’re nowhere near ready for [focusing on model efficiency],” said Kathleen Walch, managing partner and principal analyst at industry firm Cognilytica.37 “So, I think from a research perspective, great. They are really pushing the boundaries; they’re really figuring out what they can do. But for most companies, they’re not there yet.”

Walch emphasized that companies need to institute appropriate AI methodologies for running these projects internally before working toward greater efficiencies. Similarly, companies that have purchased AI solutions need to ensure their vendors are using methodologies aligned with industry best practices; they also need to allow those same solution providers to make necessary upgrades to introduce efficiencies into the workstream.

“As companies are continuing to look to adopt artificial intelligence and machine learning in their organization, they really need to focus on how to run AI projects,” Walch noted. “We’ve started to see some companies slowly realize, ‘Ok, we need to adopt a methodology,’ but this is going to be a big area of focus that more and more companies are going to need to adopt if they would like to have successful AI projects.”

5. Improving Model Efficiency

While researchers continue to make models more efficient and thus more accessible, companies struggle with internal methodologies to support AI/ML systems. Thus, this topic will be a focus of both the research and applied communities for the foreseeable future.

Leading Applications

While fundamental research in computer vision continues to soar, so too do market-based applications, as witnessed by announcements from CVPR 2021 Sponsors.

Intel

“Intel, EXOR International, TIM and JMA Wireless teamed together to build an end-to-end smart factory in Verona, Italy, as an example of the benefits of Industry 4.0 digitalization to manufacturers of all sizes.

Manufacturers are evaluating ways to take advantage of industrial Internet of Things (IIoT) technologies such as artificial intelligence (AI) and 5G to reduce maintenance and energy costs and improve workforce productivity… EXOR’s smart factory aims to demonstrate the operational benefits of digitalization, including:

• Autonomous human resources scheduling, reacting to changes in orders and employee availability in real time
• A clear indicator of whether everything planned for the week, including supplies, components and documentation, is in order and ready for production.
• Real-time updates on order status and work-in-progress advancements, regardless of order size.”

– Press release, “Paving the Way for Smart Factories”38

Kitware

“Kitware, Inc., leaders in deep learning for computer vision, has been selected as a prime contractor on the Defense Advanced Research Projects Agency (DARPA) Invisible Headlights program. The Kitware team will address the technical challenges associated with developing passive 3D sensors for stealthy, nighttime, autonomous driving in defense applications.”

– Press release, “Kitware to Develop AI Solution for Stealthy Autonomous Driving”39

NVIDIA

“A typical self-driving vehicle can use dozens of DNNs for perception, localization and path planning.

However training neural networks requires a hefty amount of processing power, which is why Tesla built its supercomputer using powerful Nvidia GPUs.

Tesla’s supercomputer uses 720 nodes of 8x NVIDIA A100 Tensor Core GPUs (5,760 GPUs total) to achieve an unparalleled 1.8 exaflops of performance, making it one of the world’s most powerful computers. This kind of processing power is mind boggling.”

– FutureCar, “A Closer Look at Tesla’s Nvidia-powered Supercomputer for Training Deep Neural Networks”40
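The figures in this quote can be checked with simple arithmetic. The short calculation below confirms the GPU count and derives the per-GPU throughput implied by the 1.8-exaflop total; the variable names are illustrative, and the per-GPU figure is an inference from the quoted numbers rather than something the article states.

```python
# Figures quoted in the article.
nodes = 720
gpus_per_node = 8
total_flops = 1.8e18  # 1.8 exaflops

# Total GPU count: 720 nodes x 8 GPUs per node.
total_gpus = nodes * gpus_per_node  # 5760

# Throughput each GPU would need to contribute, in teraflops (1e12 FLOPS).
per_gpu_teraflops = total_flops / total_gpus / 1e12  # 312.5
```

The implied figure of about 312.5 teraflops per GPU is close to the peak dense FP16 Tensor Core rate that NVIDIA lists for the A100, which suggests the 1.8-exaflop number is a theoretical peak at reduced precision rather than a measured result.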

OPPO

“Leading global smartphone brand OPPO recently took part in the premier annual computer vision event Computer Vision and Pattern Recognition Conference (CVPR) 2021. During the conference, OPPO’s achievements in AI were recognized with its placing in six major Challenges in eleven different contests in total demonstrating the company’s industry-leading technological strengths and innovative break-throughs in AI.”

– Manila Bulletin, “Global smartphone brand’s achievements in AI recognized at the Computer Vision and Pattern Recognition Conference 2021”41

For continued industry adoption of computer vision technologies, advanced training will be necessary. In fact, a study from analyst firm Cognilytica concludes the biggest barrier to AI adoption is insufficient quantity or quality of data, followed by limited availability of AI talent and skills.42


Conclusion

A special thank you to CVPR 2021 Champion Sponsors.

Although five key topics emerged this year, it stands to reason that computer vision and pattern recognition will continue to influence research and development efforts and create new areas of focus for years to come. Advances in AI and ML mean the technologies will become more widespread and mainstream. As they do, increases in applications will drive demands for enhanced efficiency, new data analysis, and ongoing ethical deliberations.

Projections of growth in the AI and ML market reiterate the ways this industry will continue to shift and evolve, and along with it, the research efforts supported by the scientific community. With so much changing in the industry, the “five to watch” is sure to grow and shift over the next year. But as the industry evolves to meet emerging challenges and opportunities, CVPR will continue to respond with problem-solving scientific explorations of technical topics and applicable solutions to support the designs of the research community and the industry as a whole.

Save the date for CVPR 2022, slated to take place in New Orleans, La., U.S.A., from June 19 – 24. And mark your calendars for the paper submission deadline on November 16, 2021. For more information visit http://cvpr2022.thecvf.com/.

About the IEEE Computer Society

The IEEE Computer Society is the world’s home for computer science, engineering, and technology. A global leader in providing access to computer science research, analysis, and information, the IEEE Computer Society offers a comprehensive array of unmatched products, services, and opportunities for individuals at all stages of their professional career. Known as the premier organization that empowers the people who drive technology, the IEEE Computer Society offers international conferences, peer-reviewed publications, a unique digital library, and training programs.

Visit computer.org for more information.


References

1 – https://www.eventscribeapp.com/live/videoPlayer.asp?lsfp=cW5QR1o1VHlyVnF3cEFnZWJjVzI0dE5TQjlhVWx2UWJYVkpLUGhXNStmVT0=

2 – https://www.grandviewresearch.com/industry-analysis/3d-machine-vision-market

3 – https://www.idc.com/getdoc.jsp?containerId=prUS47482321

4 – https://www.idc.com/getdoc.jsp?containerId=prUS47482321

5 – https://www.pwc.com/us/en/tech-effect/ai-analytics/ai-predictions.html

6 – https://www.cognilytica.com/document/global-ai-adoption-trends-forecast-2020/

7 – https://www.cognilytica.com/document/global-ai-adoption-trends-forecast-2020/

8 – https://go.forrester.com/blogs/predictions-2021-the-time-is-now-for-ai-to-shine/

9 – https://www.eventscribeapp.com/live/videoPlayer.asp?lsfp=cW5QR1o1VHlyVnF3cEFnZWJjVzI0a2xXYlJGa1E1SzR6eWlwdHpJUWhzYz0=

10 – https://aisecure-workshop.github.io/amlcvpr2021/cr/2.pdf

11 – https://christoph-busch.de/projects-mad.html; https://www.vislab.ucr.edu/B-AMFG2021/invited_talks.php

12 – https://openaccess.thecvf.com/content/CVPR2021W/AMFG/papers/Xin_EVA-GCN_Head_Pose_Estimation_Based_on_Graph_Convolutional_Networks_CVPRW_2021_paper.pdf

13 – https://www.statista.com/statistics/1153970/worldwide-facial-recognition-revenue/

14 – https://youtu.be/zK9DNM_HgnQ?list=PLW7hfeqHGkJoRKCENo-1nFLKX84WEkEFa

15 – https://www.cognilytica.com/2021/04/29/state-of-ethical-ai-frameworks

16 – https://openaccess.thecvf.com/content/CVPR2021/html/Niemeyer_GIRAFFE_Representing_Scenes_As_Compositional_Generative_Neural_Feature_Fields_CVPR_2021_paper.html

17 – https://www.eventscribeapp.com/live/videoPlayer.asp?lsfp=cW5QR1o1VHlyVnF3cEFnZWJjVzI0czg2WE9CK2ZUVWc2bkc3NXh0RWRxUT0=

18 – http://visual.cs.brown.edu/workshops/aicc2021/

19 – https://www.eventscribe.net/2021/2021CVPR/SearchByBucket.asp?f=CustomPresfield11&bm=Sponsor%20Sessions&pfp=SponsorSessions

20 – https://go.forrester.com/blogs/predictions-2021-the-time-is-now-for-ai-to-shine/

21 – https://openaccess.thecvf.com/content/CVPR2021/html/Luo_Diffusion_Probabilistic_Models_for_3D_Point_Cloud_Generation_CVPR_2021_paper.html

22 – https://www.eventscribeapp.com/live/videoPlayer.asp?lsfp=cW5QR1o1VHlyVnF3cEFnZWJjVzI0cGhrd1ozYTIvZy91cGlhdEZMUU4yWEVXY2RGS3kxWitKbnIwQnlVTkxkaA==

23 – https://openaccess.thecvf.com/content/CVPR2021/html/Lin_Point2Skeleton_Learning_Skeletal_Representations_from_Point_Clouds_CVPR_2021_paper.html

24 – https://www.eventscribeapp.com/live/videoPlayer.asp?lsfp=cW5QR1o1VHlyVnF3cEFnZWJjVzI0cCtUdzI1Snp2cjIzejBmVDFFQW84QT0=

25 – https://openaccess.thecvf.com/content/CVPR2021/html/Wang_Birds_of_a_Feather_Capturing_Avian_Shape_Models_From_Images_CVPR_2021_paper.html

26 – https://www.grandviewresearch.com/industry-analysis/3d-machine-vision-market

27 – https://neptune.ai/blog/understanding-representation-learning-with-autoencoder-everything-you-need-to-know-about-representation-and-feature-learning

28 – https://openaccess.thecvf.com/content/CVPR2021/html/Chen_Exploring_Simple_Siamese_Representation_Learning_CVPR_2021_paper.html

29 – https://www.eventscribeapp.com/live/videoPlayer.asp?lsfp=cW5QR1o1VHlyVnF3cEFnZWJjVzI0clV4QUJwOUJjZVpGWFNiVHVLaUtnL1FyTi9nRlErTFZiK01XeGNKbnJzKw==

30 – https://openaccess.thecvf.com/content/CVPR2021/html/Chen_Jigsaw_Clustering_for_Unsupervised_Visual_Representation_Learning_CVPR_2021_paper.html

31 – https://openaccess.thecvf.com/content/CVPR2021/html/Sun_Task_Programming_Learning_Data_Efficient_Behavior_Representations_CVPR_2021_paper.html

32 – https://www.eventscribeapp.com/live/videoPlayer.asp?lsfp=cW5QR1o1VHlyVnF3cEFnZWJjVzI0bGhqZlBPUmRtd3hNMFkrTGpzTFJNN1hWT0FuQ2g1WkNmMTNxVXBkc2M5bA==

33 – https://www.grandviewresearch.com/industry-analysis/artificial-intelligence-ai-healthcare-market

34 – https://sites.google.com/view/cvprmcv21#h.ju64jvcuj2kd

35 – https://openaccess.thecvf.com/content/CVPR2021/papers/Sun_Improving_the_Efficiency_and_Robustness_of_Deepfakes_Detection_Through_Precise_CVPR_2021_paper.pdf

36 – https://openaccess.thecvf.com/content/CVPR2021/papers/Huang_Group_Whitening_Balancing_Learning_Efficiency_and_Representational_Capacity_CVPR_2021_paper.pdf

37 – https://www.cognilytica.com/about/analysts-staff/; quotes taken from a recorded interview

38 – https://www.intel.com/content/www/us/en/newsroom/news/paving-way-smart-factories.html

39 – https://blog.kitware.com/kitware-to-develop-ai-solution-for-stealthy-autonomous-driving/

40 – https://www.futurecar.com/4698/A-Closer-Look-at-Teslas-Nvidia-powered-Supercomputer-for-Training-Deep-Neural-Networks

41 – https://mb.com.ph/2021/07/07/global-smartphone-brands-achievements-in-ai-recognized-at-the-computer-vision-and-pattern-recognition-conference-2021/

42 – https://www.cognilytica.com/document/global-ai-adoption-trends-forecast-2020/

