How to use OCR in Appian RPA using Falcon

1. We have an image file, and the bot should read a particular piece of text from it and extract it using OCR. How can we achieve this? Are there any samples or links?

2. Does Appian RPA Falcon support all image formats?


  • The Falcon module lets you extract text from images using either an embedded optical character recognition (OCR) engine or the Popchar engine (pseudo optical character recognition).

    With this feature you can extract element names from the screen, or even complete texts.

    The embedded OCR engine used by the Appian RPA platform is Tesseract. By default it includes support for Spanish and English; support for any other language should be checked before relying on it.

    It is important to note that OCR text extraction is not 100% reliable, so you should evaluate whether the technology is feasible as you develop your robot. For example, if you need to extract a bank account number and cannot verify that the extracted text is correct, you shouldn't use OCR for that process.
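    To illustrate what "verifying the extracted text" could look like, here is a minimal sketch that assumes the bank account is an IBAN, which carries its own check digits (the standard mod-97 check). The class and method names (IbanCheck, isValidIban) are hypothetical and are not part of Falcon or the Appian RPA API; other account formats would need their own check.

```java
import java.math.BigInteger;
import java.util.Locale;

public class IbanCheck {

    /** Returns true if the candidate passes the IBAN mod-97 check. */
    public static boolean isValidIban(String candidate) {
        if (candidate == null) {
            return false;
        }
        // OCR output often contains stray spaces; strip them and normalize case.
        String iban = candidate.replaceAll("\\s+", "").toUpperCase(Locale.ROOT);
        // Two letters, two check digits, then 11 to 30 alphanumeric characters.
        if (!iban.matches("[A-Z]{2}[0-9]{2}[A-Z0-9]{11,30}")) {
            return false;
        }
        // Move the country code and check digits to the end.
        String rearranged = iban.substring(4) + iban.substring(0, 4);
        // Replace each letter with its number (A=10 ... Z=35); digits stay as they are.
        StringBuilder digits = new StringBuilder();
        for (char c : rearranged.toCharArray()) {
            digits.append(Character.getNumericValue(c));
        }
        // A well-formed IBAN leaves a remainder of 1 when divided by 97.
        return new BigInteger(digits.toString())
                .mod(BigInteger.valueOf(97)).intValue() == 1;
    }

    public static void main(String[] args) {
        // A commonly cited sample IBAN, then the same value with one digit misread.
        System.out.println(isValidIban("GB82 WEST 1234 5698 7654 32"));
        System.out.println(isValidIban("GB82 WEST 1234 5698 7654 37"));
    }
}
```

    A robot could call a check like this on the OCR result and route the item to manual review when it fails, instead of trusting the raw text.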

    For more information visit  https://docs.appian.com/suite/help/20.2/rpa/modules/falcon-module.html

    Regarding the second question, I have not had any problems with the formats I have used, whether JPEG or PNG. When I use Falcon, I generally convert to PNG and work with that.
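    If you want to standardize on PNG before handing images to the robot, a sketch along these lines might help. It uses only the standard javax.imageio classes; PngConverter and toPng are hypothetical names, and which input formats can be read depends on the image readers available in your Java runtime.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class PngConverter {

    /** Reads any format ImageIO can decode and rewrites it as a PNG file. */
    public static File toPng(File source, File target) throws IOException {
        BufferedImage image = ImageIO.read(source);
        if (image == null) {
            // ImageIO.read returns null when no registered reader can decode the file.
            throw new IOException("Unsupported or unreadable image: " + source);
        }
        if (!ImageIO.write(image, "png", target)) {
            // ImageIO.write returns false when no suitable writer is found.
            throw new IOException("No PNG writer available for: " + target);
        }
        return target;
    }
}
```

    PNG is lossless, so converting does not add new compression artifacts, although it cannot restore detail that a JPEG has already lost.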

  • I have attached a simple example:

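    // Standard-library imports used below. The Falcon types (ETextFormInImage,
    // ELanguageInImage) must also be imported from the Appian RPA libraries.
    import java.awt.Graphics2D;
    import java.awt.Rectangle;
    import java.awt.RenderingHints;
    import java.awt.image.BufferedImage;
    import java.nio.file.Paths;
    import javax.imageio.ImageIO;

    // The methods below are members of the robot class (MyRobot);
    // server and falcon are fields of that class.

    public void recognize() throws Exception {

        final BufferedImage defaultImage =
                ImageIO.read(Paths.get(server.getCurrentDir(), "img", "img.png").toFile());

        if (defaultImage != null) {
            // Region to read: 58 x 15 pixels, starting at the top-left corner (0, 0).
            Rectangle rectanguloCif = new Rectangle(58, 15);

            String text = falcon.extractText(MyRobot.convertToARGB(defaultImage), rectanguloCif,
                    ETextFormInImage.TREAT_THE_IMAGE_AS_A_SINGLE_TEXT_LINE,
                    ELanguageInImage.SPANISH, null, 1.9f, 0f);

            server.info("Result: " + text);
        } else {
            // ImageIO.read returns null when no registered reader can decode the file.
            server.info("Null Image");
        }
    }

    // Without cropping: copy the whole image into an ARGB buffer.
    public static BufferedImage convertToARGB(BufferedImage image) {
        BufferedImage newImage = new BufferedImage(
                image.getWidth(), image.getHeight(), BufferedImage.TYPE_INT_ARGB);

        Graphics2D g = newImage.createGraphics();
        RenderingHints rh = new RenderingHints(
                RenderingHints.KEY_TEXT_ANTIALIASING,
                RenderingHints.VALUE_TEXT_ANTIALIAS_OFF);
        g.setRenderingHints(rh);
        // Draw at (0, 0) so the region passed to extractText covers actual image content.
        g.drawImage(image, 0, 0, null);
        g.dispose();
        return newImage;
    }

    // With cropping: keep only the 58 x 15 area at the top-left corner.
    // Named differently from convertToARGB because both take the same parameter.
    public static BufferedImage cropAndConvertToARGB(BufferedImage image) {
        BufferedImage img = image.getSubimage(0, 0, 58, 15);
        BufferedImage copyOfImage = new BufferedImage(
                img.getWidth(), img.getHeight(), BufferedImage.TYPE_INT_ARGB);

        Graphics2D g = copyOfImage.createGraphics();
        RenderingHints rh = new RenderingHints(
                RenderingHints.KEY_TEXT_ANTIALIASING,
                RenderingHints.VALUE_TEXT_ANTIALIAS_OFF);
        g.setRenderingHints(rh);
        g.drawImage(img, 0, 0, null);
        g.dispose();
        return copyOfImage;
    }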
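
    If the extracted text comes back empty, it can help to look at exactly which pixels fall inside the rectangle given to the OCR. The following is a minimal debugging sketch that uses only standard Java image classes; OcrRegionDump and dumpRegion are hypothetical names and are not part of the Falcon API.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class OcrRegionDump {

    /** Writes the part of the image covered by region to a PNG file for inspection. */
    public static File dumpRegion(BufferedImage image, Rectangle region, File target)
            throws IOException {
        // Clip the region to the image bounds so getSubimage cannot throw.
        Rectangle clipped = region.intersection(
                new Rectangle(0, 0, image.getWidth(), image.getHeight()));
        if (clipped.isEmpty()) {
            throw new IllegalArgumentException("Region lies outside the image: " + region);
        }
        BufferedImage crop = image.getSubimage(
                clipped.x, clipped.y, clipped.width, clipped.height);
        if (!ImageIO.write(crop, "png", target)) {
            throw new IOException("No PNG writer available for: " + target);
        }
        return target;
    }
}
```

    Opening the written file shows whether the rectangle really contains the text. A blank or partly cut-off crop points to wrong coordinates rather than to the OCR engine.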

  • In reply to jesusp0002:

    Hey Jesus,

    We tried your code and it identifies the image perfectly, but it still does not extract the text from it. We also tried with only the part of the image that contains the text, but it still didn't extract anything.

    Could you please help us with this, or share the image you used to run the code above?

    Thanks in advance!