Is AI replacing Architects?

Architecture is perhaps the most complex discipline, operating in more dimensions than any other coordinated human activity. However, with the advancement of artificial intelligence, architects too, like people in every other profession, are worried about the automation that has already taken specific tasks away from their roles.

While ‘humans are hooked and machines are learning’, AI and ML are disrupting all manner of industries. Although AI has taken decades to go from crazy lab demos to finished consumer products, today there are immense possibilities for the industry to be augmented and enhanced by artificial intelligence.

The earliest sense of advancement in the construction field came with Building Information Modelling (BIM), a term that has existed since the 1970s but came to the fore in the early 2000s, when Autodesk began popularizing it.

The result was BIM software, which supports an intelligent 3D-modelling process used by architecture, engineering, and construction (AEC) practitioners to design and construct any kind of infrastructure. BIM software includes computer-aided design (CAD) tools and libraries specifically targeted toward architectural design and construction, and goes beyond traditional drawings to generate a fully digital model.

Over the years, BIM software has had a huge influence on the day-to-day operations of architectural firms.

Parametric design, or programmed architecture, can run through several design styles in no time and come up with a Zaha Hadid-style building plan that would otherwise take years to design.

Over the last few decades, BIM has transformed the roles of engineers, contractors, architects, developers, and consultants by allowing them to speak the same language and collaborate better. It has revolutionized both the design process and the designs themselves.

BIM software produces an immense volume of data, so much so that most architecture firms and their consulting partners don’t know what to do with it. Once AI permeated the technological landscape and spread into every imaginable business use case, the industry learned to create value by collecting, organizing, and storing building-related data (from models, simulations, and so on). It is now widely believed that finding the most optimal design for each construction project is within reach.

AI + BIM = Optimized [Affinity]

When ‘parametric design’ technology is combined with AI that can use 6D BIM models and record the whole life cycle of a building, it can arrive at better decisions and insights into project execution by learning from the mistakes of the past.

Today, there are machines that can run through vast numbers of datasets, simulate each model, pick the best option, verify its efficiency, and continue to learn and communicate when paired with new autonomous building technology.

AI is the next frontier for architecture
Changes in demographics, technology, and business models have opened up a plethora of far-reaching opportunities for architects to explore areas like urban housing in more ecosystems than ever before.

Let’s have a look at some architectural products augmented and enhanced by AI.

Road Printers
This six-meter-wide machine can pave entire streets at once. The stones fall onto the road directly in the appropriate pattern. The device is simple to handle and finishes the work quickly.

Concrete 3D Printers
These machines use 3D printing as a core method to fabricate buildings or construction components. At construction scale, the technique has a wide variety of applications within the private, commercial, industrial, and public sectors. Concrete 3D printers enable faster construction, lower labor costs, increased accuracy, greater integration of function, and less waste.

Brick Laying Machine
The bot can lay between 300 and 400 bricks an hour, compared to a human, who can lay only around 60 to 75 bricks an hour. It works roughly five times faster than a human and can help alleviate the labor shortage.

Brick Laying & 3D Printing Concrete Drone
Though in its infancy, researchers from Imperial College London have taken the first step towards making this a reality with their work on a drone that is able to ‘3D print’ while it is in flight.

However efficient bots may be, they will always lag in understanding the personality and character of the customer, and this is where humans intervene.

With the help of AI, architects can move beyond the one-size-fits-all range of products already in the marketplace and create more personalized solutions that align closely with user needs. But it is the imperfections in our creative decisions that truly make something personal and unique.

What is your opinion about AI in architecture? Do you think AI will either augment or eliminate every profession in the near future?

Let us know by commenting.

To know us in person, reach us at hello@mantralabsglobal.com



Tabular Data Extraction from Invoice Documents

5 minutes, 12 seconds read

Extracting information from tables is a long-standing problem in machine learning and image processing. Although recent advances in deep learning have been very successful, tabular data extraction remains a challenge because of the wide variety of ways in which tables are represented, both visually and structurally. Below are some examples:

Fig. 1

Fig. 2

Fig. 3

Fig. 4

Fig. 5

Invoice Documents

Many companies process their bills in the form of invoices which contain tables that hold information about the items along with their prices and quantities. This information is generally required to be stored in databases while these invoices get processed.

Traditionally, this information is entered by hand into database software. However, this approach has some drawbacks:

1. The whole process is time consuming.

2. Errors might be introduced during the data entry process.

3. Extra cost of manual data entry.

An invoice automation system can be deployed to address these shortcomings. The idea is that the user uploads the invoice document and the system reads it and generates the tabular information in digital format, making the whole process faster and more cost-effective for companies.

Fig. 6

Fig. 6 shows a sample invoice that contains some regular invoice details such as Invoice No, Invoice Date, Company details, and two tables holding transaction information. Now, our goal is to extract the information present in the two tables.

Tabular Information

The problem of extracting tables from invoices can be condensed into 2 main subtasks.

1. Table Detection

2. Tabular Structure Extraction.

 What is Table Detection?

 Table Detection is the process of identifying and locating tables that are present in a document, usually an image. There are multiple ways to detect tables in an image. Some of the approaches make use of image processing toolkits like OpenCV while some of the other approaches use statistical models on features extracted from the documents such as Text Position and Text Characteristics. Recently more deep learning approaches have been used to detect tables using trained neural networks similar to the ones used in Object Detection.

What is Table Structure Extraction?

Table Structure Extraction is the process of extracting the tabular information once the boundaries of the table have been detected through Table Detection. The information within the rows and columns is then extracted and transferred to the desired format, usually a CSV or Excel file.

Table Detection using Faster RCNN

Faster RCNN is a neural network model from the RCNN family. It is the successor of Fast RCNN, which Ross Girshick created in 2015. The name Faster RCNN signifies an improvement over the previous model in both training speed and detection speed.

To read more about the model framework, one can access the paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.

 There are many other object detection model architectures that are available for use today. Each model comes with certain advantages and disadvantages in terms of prediction accuracy, model parameter size, inference speed, etc.

For the task of detecting tables in invoice documents, we will select the Faster RCNN model with an FPN (Feature Pyramid Network) as the feature extraction network. The model is pre-trained on the ImageNet corpus using the ResNet-101 architecture. The ImageNet corpus is a public dataset that consists of more than 20,000 image categories of everyday objects. We will use the PyTorch framework to train and test the model.

The above-mentioned model gives us a fast inference time and a high Mean Average Precision. It is preferred for cases where quick, real-time detection is desired.
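Mean Average Precision is built on the intersection over union (IoU) between a predicted box and a labeled box. The sketch below shows only that building block, not the evaluation actually used here. It assumes boxes are (x_min, y_min, x_max, y_max) tuples in pixels; the function name and the sample boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # A negative width or height means the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical predicted and labeled table boxes on one invoice page.
predicted = (50, 100, 450, 300)
ground_truth = (60, 110, 460, 310)
print(f"IoU = {iou(predicted, ground_truth):.3f}")
```

A detection is usually counted as correct when its IoU with a labeled table exceeds a chosen threshold such as 0.5; the threshold is a design choice, not something stated in this article.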

First, the model is to be trained using public datasets for Table Detection such as Marmot and UNLV datasets. Next, we further fine-tune the model with our custom labeled dataset. For the purpose of labeling, we will follow the COCO annotation format.
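For readers unfamiliar with the COCO annotation format, here is a minimal sketch of what a labeled invoice could look like. The file name, image size, and box coordinates are made up for illustration, and the helper name is hypothetical. COCO stores each box as [x_min, y_min, width, height].

```python
import json


def make_coco_dataset(image_records):
    """Build a COCO-style dict from (file_name, width, height, boxes) records.

    Each box is (x_min, y_min, box_width, box_height) in pixels.
    """
    dataset = {
        "images": [],
        "annotations": [],
        # A single category, since we only label tables.
        "categories": [{"id": 1, "name": "table"}],
    }
    annotation_id = 1
    for image_id, (file_name, width, height, boxes) in enumerate(image_records, start=1):
        dataset["images"].append(
            {"id": image_id, "file_name": file_name, "width": width, "height": height}
        )
        for x, y, w, h in boxes:
            dataset["annotations"].append(
                {
                    "id": annotation_id,
                    "image_id": image_id,
                    "category_id": 1,
                    "bbox": [x, y, w, h],
                    "area": w * h,
                    "iscrowd": 0,
                }
            )
            annotation_id += 1
    return dataset


# Hypothetical invoice page with two labeled tables.
records = [("invoice_001.png", 1240, 1754, [(80, 400, 1080, 220), (80, 700, 1080, 180)])]
coco = make_coco_dataset(records)
print(json.dumps(coco, indent=2))
```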

Once trained, the model displayed an accuracy close to 86% on our custom dataset. There are certain scenarios where the model fails to locate the tables such as cases containing watermarks and/or overlapping texts. Tables without borders are also missed in a few instances. However, the model has shown its ability to learn from examples and detect tables in multiple different invoice documents. 

Fig. 7

After running inference on the sample invoice from Fig 6, we can see two table boundaries being detected by the model in Fig 7. The first table is detected with a confidence score of 100% and the second with a confidence score of 99%.
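The percentages reported for each detected table are per-detection scores produced by the model. A common post-processing step, sketched here under assumed data structures, is to discard low-scoring boxes and order the rest from top to bottom. The dictionary layout, the 0.9 threshold, and the sample values are illustrative assumptions.

```python
def keep_confident(detections, min_score=0.9):
    """Drop detections below min_score and sort the rest top to bottom."""
    kept = [d for d in detections if d["score"] >= min_score]
    # box is (x_min, y_min, x_max, y_max); sort by the top edge.
    return sorted(kept, key=lambda d: d["box"][1])


# Hypothetical model output for one invoice page.
detections = [
    {"box": (80, 700, 1160, 880), "score": 0.99},
    {"box": (80, 400, 1160, 620), "score": 1.00},
    {"box": (300, 50, 600, 120), "score": 0.35},  # likely a false positive
]
tables = keep_confident(detections)
for table in tables:
    print(table["box"], table["score"])
```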

Table Structure Extraction

Once the boundaries of the table are detected by the model, an OCR (Optical Character Recognition) mechanism is used to extract the text within the boundaries. The text is then processed separately for each table, using only the information that falls inside that table.
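One simple way to tie OCR output to a detected table is to keep only the words whose centre lies inside the table boundary. This is a sketch under assumptions, not the exact method used here: it assumes the OCR engine returns each word with a pixel bounding box, represented below as (text, x_min, y_min, x_max, y_max) tuples, and the function name is hypothetical.

```python
def words_in_table(words, table_box):
    """Return the OCR words whose centre point lies inside table_box."""
    tx_min, ty_min, tx_max, ty_max = table_box
    selected = []
    for text, x_min, y_min, x_max, y_max in words:
        centre_x = (x_min + x_max) / 2
        centre_y = (y_min + y_max) / 2
        if tx_min <= centre_x <= tx_max and ty_min <= centre_y <= ty_max:
            selected.append((text, x_min, y_min, x_max, y_max))
    return selected


# Hypothetical OCR output for part of an invoice page.
ocr_words = [
    ("Invoice", 100, 50, 200, 80),   # page header, outside the table
    ("Item", 100, 400, 150, 420),
    ("Qty", 400, 400, 440, 420),
    ("Total", 900, 900, 960, 920),   # footer, outside the table
]
table_words = words_in_table(ocr_words, table_box=(80, 380, 1160, 620))
print([word[0] for word in table_words])
```

Using the centre point rather than requiring the whole word box to be inside tolerates boundaries that clip a word slightly.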

We were able to extract the correct structure of the table, including its headers and line items, using rules derived from the invoice layouts. The difficulty of this process depends on the type of invoice format at hand.
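The actual rules are not given in this article. As an illustration only, one possible layout heuristic groups words into rows by vertical position and into columns by horizontal gaps. The word tuples follow an assumed (text, x_min, y_min, x_max, y_max) layout, and the tolerance values and function names are hypothetical.

```python
def group_rows(words, y_tolerance=10):
    """Cluster words into rows by the vertical centre of their boxes."""
    rows = []
    for word in sorted(words, key=lambda w: (w[2] + w[4]) / 2):
        centre_y = (word[2] + word[4]) / 2
        if rows and abs(centre_y - rows[-1]["centre_y"]) <= y_tolerance:
            rows[-1]["words"].append(word)
        else:
            rows.append({"centre_y": centre_y, "words": [word]})
    # Within each row, read words left to right.
    return [sorted(row["words"], key=lambda w: w[1]) for row in rows]


def column_spans(words, min_gap=15):
    """Merge the horizontal extents of all words into column spans.

    A horizontal gap of at least min_gap pixels starts a new column.
    """
    spans = []
    for x_min, x_max in sorted((w[1], w[3]) for w in words):
        if spans and x_min - spans[-1][1] < min_gap:
            spans[-1][1] = max(spans[-1][1], x_max)
        else:
            spans.append([x_min, x_max])
    return spans


def to_grid(words, y_tolerance=10, min_gap=15):
    """Place each word into a (row, column) cell and return a list of rows."""
    spans = column_spans(words, min_gap)
    grid = []
    for row in group_rows(words, y_tolerance):
        cells = [""] * len(spans)
        for text, x_min, _, x_max, _ in row:
            centre_x = (x_min + x_max) / 2
            for index, (span_min, span_max) in enumerate(spans):
                if span_min <= centre_x <= span_max:
                    cells[index] = (cells[index] + " " + text).strip()
                    break
        grid.append(cells)
    return grid


# Hypothetical words from one small table.
words = [
    ("Item", 100, 400, 150, 420), ("Qty", 400, 400, 440, 420),
    ("Price", 600, 400, 660, 420),
    ("Blue", 100, 440, 145, 460), ("Widget", 150, 440, 230, 460),
    ("2", 410, 440, 420, 460), ("9.50", 605, 440, 650, 460),
]
grid = to_grid(words)
for cells in grid:
    print(cells)
```

A gap-based heuristic like this breaks down when the horizontal extents of neighbouring columns overlap, since the spans then merge into one column. Real invoice formats generally need format-specific rules on top of it.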

There are multiple challenges that one may encounter while building an algorithm to extract structure. Some of them are:

  1. The span of some table columns may overlap making it difficult to determine the boundaries between columns.
  2. The fonts and sizes present within tables may vary from one table to another. The algorithm should be able to accommodate this variation.
  3. A table might be split across two pages, and detecting the continuation of a table might be challenging.

Certain deep learning approaches have also been published recently to determine the structure of a table. However, training them on custom datasets still remains a challenge. 

Fig 8

The final result is then stored in a CSV file, which can be edited or archived as needed. Fig 8 shows the extracted information for the first table.
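Writing the extracted rows to CSV needs nothing beyond Python's standard csv module. In the sketch below, the grid contents and the helper name are made up for illustration; the csv writer quotes values that contain commas, such as formatted amounts.

```python
import csv
import io


def grid_to_csv(grid):
    """Serialize a list of rows (each a list of cell strings) to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerows(grid)
    return buffer.getvalue()


# Hypothetical extracted table: a header row followed by line items.
grid = [
    ["Item", "Qty", "Price"],
    ["Blue Widget", "2", "9.50"],
    ["Steel Frame", "1", "1,200.00"],  # the comma forces quoting
]
csv_text = grid_to_csv(grid)
print(csv_text)
```

To save the result to disk, open the target file with newline="" and pass the file object to csv.writer in place of the in-memory buffer.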


The deep learning approach to extracting information from structured documents is a step in the right direction. With high accuracy and low running time already achievable, such systems should only perform better as they learn from more data. Recent and upcoming advances in computer vision have made processes such as invoice automation significantly more accessible and robust.

About the author:

Prateek Sethi is a Data Scientist working at Mantra Labs. His work involves leveraging Artificial Intelligence to create data-driven solutions. Apart from his work he takes a keen interest in football and exploring the outdoors.
