Startup Uses AI To Create Programs From Simple Screenshots

An anonymous reader shares an article: A new neural network being built by a Danish startup called UIzard Technologies IVS can transform raw designs of graphical user interfaces into actual source code that can be used to build them. Company founder Tony Beltramelli has just published a research paper revealing how it is done: cutting-edge machine learning techniques are used to train a neural network that generates code automatically when fed screenshots of a GUI. The Pix2Code model actually outperforms many human coders because it can create code for three separate platforms (Android, iOS and "web-based technologies"), whereas many programmers are only able to do so for one platform. Pix2Code can create GUIs from screenshots with an accuracy of 77 percent, but that will improve as the algorithm learns more, the founder said.
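At a high level, the generation the summary describes is iterative: the model predicts one DSL token at a time from the image features plus the tokens emitted so far, until an end marker appears. A minimal sketch with a hypothetical stub in place of the trained CNN+LSTM model (`predict_next` and the token names are invented for illustration):

```python
# Sketch of greedy decoding into a pix2code-style DSL.
# `predict_next` is a hypothetical stand-in for the trained
# CNN+LSTM model; the real one scores tokens from image
# features plus the token history.

END = "<END>"

def predict_next(image_features, history):
    # Toy deterministic "model": emits a fixed token sequence.
    script = ["stack", "{", "btn", "}", END]
    return script[min(len(history), len(script) - 1)]

def decode(image_features, max_len=100):
    """Greedily emit DSL tokens until the end marker."""
    tokens = []
    while len(tokens) < max_len:
        tok = predict_next(image_features, tokens)
        if tok == END:
            break
        tokens.append(tok)
    return tokens

print(decode(None))  # ['stack', '{', 'btn', '}']
```

The `max_len` cap matters in practice: a decoder that never emits the end marker must be cut off somewhere.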
  • by bugs2squash ( 1132591 ) on Monday May 29, 2017 @06:06PM (#54507477)
    I'm pretty sure code generators have been able to accept input from graphical layout editors for a while. Just what is this AI "inferring" ?
    • Re:no need for AI (Score:4, Interesting)

      by smallfries ( 601545 ) on Tuesday May 30, 2017 @03:26AM (#54509077) Homepage

      The output from a layout editor is a structured description of the components and their layout. This is inferring that description from a .png - the LSTM is building a description of the structural relationship between the widgets from the input image.

      It looks pretty cool, although quite simple. The intermediate token stream that it is inferring may be more interesting as a design tool than the neural network on the front-end that is building it.
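That intermediate token stream is essentially a tiny DSL; parsing it into a widget tree is straightforward. A sketch with an invented token vocabulary, roughly in the spirit of pix2code's DSL (braces open and close a widget's child scope):

```python
# Parse a flat pix2code-style DSL token stream into a nested
# widget tree. Token names here are illustrative only.

def parse(tokens):
    """Build ('name', children) nodes from a flat token stream.
    '{' opens the most recent widget's child scope; '}' closes it."""
    root = ("root", [])
    stack = [root]
    last = root
    for tok in tokens:
        if tok == "{":
            stack.append(last)
        elif tok == "}":
            stack.pop()
        else:
            last = (tok, [])
            stack[-1][1].append(last)
    return root

tree = parse(["stack", "{", "row", "{", "btn", "btn", "}", "}"])
print(tree)  # ('root', [('stack', [('row', [('btn', []), ('btn', [])])])])
```

A tree like this, rather than the raw pixels, is what a downstream design tool would actually manipulate.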

  • by Anonymous Coward

    This just shows how easy it is to app an app that apps other apps, unlike LUDDITE software!


  • I RTFA (Score:5, Informative)

    by Anonymous Coward on Monday May 29, 2017 @06:17PM (#54507519)

    It only generates the layout files for the different platforms.
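One DSL with a renderer per platform is consistent with "only generates the layout files": the same token stream maps to different platform markup. A toy illustration; the token-to-widget mapping and output strings are invented, not the tool's actual output:

```python
# Render DSL widget names to per-platform layout markup.
# The mapping below is invented for illustration.
MAPPING = {
    "android": {"btn": "<Button />", "label": "<TextView />"},
    "ios":     {"btn": "UIButton()", "label": "UILabel()"},
    "web":     {"btn": "<button></button>", "label": "<span></span>"},
}

def render(tokens, platform):
    """Map each known DSL token to the platform's widget markup."""
    table = MAPPING[platform]
    return [table[t] for t in tokens if t in table]

print(render(["btn", "label"], "android"))  # ['<Button />', '<TextView />']
```

This is also why "three platforms" is less impressive than it sounds: once the DSL is inferred, each extra platform is just another lookup table.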

    • Re: (Score:3, Insightful)

      by kaizendojo ( 956951 )
      Which frees designers to work on UI and coders to concentrate on tighter code instead of the tedium of coding UIs from sketches. Everybody wins.
      • Re:I RTFA (Score:5, Interesting)

        by HornWumpus ( 783565 ) on Monday May 29, 2017 @06:58PM (#54507691)

        Yep, everybody did win. Back in 1990, when Rapid Application Development (RAD in the hype of the day) tools did this.

        Your IDE still has this feature. Drag and drop UI drawing is better than having any UI inferred from a drawing. How do you draw a mask?

        Trying to do this well, transparently, for multiple different devices plus 'browser' is challenging, to say the least. But...GOOD NEWS...each of these markets is big enough to support a UI team of its own. Claiming to do it well, automatically, from a 'napkin sketch', for all significant platforms is braggadocious to the point where adults start to whisper about where the person's keeper is, calling him 'Sheldon'.

        But we all remember being 22 and doing similar; 'that's easy, just...' The time it takes from 'that's easy' to 'uhh....shit' is what separates success from failure, long term.

        The best this will do is produce a 'wrong' (control behavior from a drawing?) UI for a 'sketch artist' who hasn't bothered to learn to use his IDE. Somebody still has to come along, muddling through the messes (one per target), and fix it.

        • It sure sounds a lot like what we were trying to do back then. From concept to skeleton gui and code, so that the business can participate in the design and get exactly what they think they want, without having some annoying nay-saying UI/UX guy in the middle. I suppose that is how we ended up with monstrosities like SAP.
          • Yeah, that would be great if this AI could actually create this stuff from ANY picture. BUT it can only generate stuff from pictures that have been made with predefined tools... and that's where this is all bollocks, as most mockup tools can already generate code for any given platform.
            This article seems more like a plug by the developers themselves, as at the moment it doesn't add anything to the already existing platforms. Also, not to forget, this can only generate for frameworks it knows, so the

      • That "tedium" of coding the ui is usually the easiest part of app development, for me at least.

        Does the neural net also recognize dynamic UIs with swipe, pinch and twirl responses?

        Neat academic project... but 77% accuracy just to do the layouts? I've got interns better than that.

        • Re:I RTFA (Score:4, Interesting)

          by Fnkmaster ( 89084 ) on Monday May 29, 2017 @11:01PM (#54508501)

          Agreed. We built a nearly identical system (with OpenCV for morphological analysis and neural networks) about a year ago, as part of a larger AI-powered mobile app development platform.

          77% accuracy is not very impressive, but unclear what the training and test sets are here.

          The biggest functional win from this is actually getting sensible layout params from a designer's UI mockup - i.e. figuring out whether this should be right- or left-justified, whether there should be margin/padding here, etc. We solved that problem pretty well.

          Other challenges involve asset up-scaling, background image color extraction, etc. If you can take rough image mockups and output well polished asset packs, with vectorized images, layout files and stub code for developers to work with, that's a pretty significant win.

          We got that far with the project, but ended up shifting direction to a somewhat different market where there was more growth potential - literally nobody wanted to invest further in mobile app tools in 2016, AI powered or not.

          So yeah, cool proof-of-concept, but as a standalone offering this doesn't create much value. As part of a larger toolchain might be valuable.
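The justification/margin inference mentioned here doesn't necessarily need a neural network; given widget bounding boxes and the parent width, simple gap comparisons go a long way. A minimal sketch (the `(x, y, w, h)` box format and the pixel tolerance are assumptions):

```python
def infer_alignment(box, parent_width, tol=4):
    """Guess horizontal justification from a widget's (x, y, w, h)
    bounding box by comparing the left and right gaps to the parent.
    `tol` is an assumed pixel tolerance for calling it centered."""
    x, _, w, _ = box
    left_gap = x
    right_gap = parent_width - (x + w)
    if abs(left_gap - right_gap) <= tol:
        return "center"
    return "left" if left_gap < right_gap else "right"

print(infer_alignment((10, 0, 100, 20), 320))   # left
print(infer_alignment((110, 0, 100, 20), 320))  # center
```

The hard part, as the parent says, is getting robust boxes out of a rough mockup in the first place; once you have them, rules like this are cheap.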

        • 77% accuracy just to do the layouts?

          This is version 1.0. It will rapidly improve.

          I've got interns better than that.

          Interns have to be paid. Software works for (almost) nothing.

          You remind me of this guy:
          "The Americans have need of the telephone, but we do not. We have plenty of messenger boys." -- William Henry Preece, explaining why telephones would never be used in Britain.

  • 77% is not good. I would not say it's outperforming any coders.
    • Also bear in mind that the thing you are generating is a layout, and it's being generated from a screenshot of that exact thing.

      What might have a bit of value is generating real layouts from UI mockups.

      • So you've got to

        1. Create the layout
        2. Create a screenshot of it
        3. Have the program generate the layout.

        Seems to me that steps 2 and 3 are redundant except for photoshop farts who think that a .psd file is all you need to "define your requirements".

        • Yes, pretty much. If you've created an image of the layout to the point that machine learning can interpret it then you might as well have just created the layout in the first place.
    • How many programmers have you met that could be given a UI screenshot, say "rightio" and go off & implement it on three platforms - without asking a single question?

      Maybe when the algorithm is also as experienced as a programmer that could do that, we can expect better.

      • The ones that could were/are _terrible_coders_. UI configuration is more complicated than a screenshot can capture. If they complete it from a screenshot without asking questions, fire their incompetent asses. They missed a lot, guaranteed, assumed even more...

        UI Mockup is a small part of systems analysis. Comes relatively late in the process. Mostly the UI 'falls out' of the data structure or data/business layers, depending on how you 'squint' at the process.

        But that's all old stuff, doesn't involve A

      • But it's not implementing the application on ANY platform. Not even the UI behaviour. Just the bare layout. As others have pointed out, we've had tools to do this since the '80s - and the generated code for default behaviours. And the ability to select custom behaviours rather than have to write the code. Even dBASEIV was far more capable than this overhyped "AI".
      • Xamarin is better
      • Basically every programmer I have ever met.
        Why would I have a question if I have a drawn mock up of a screen I should program?
        And why would there be questions because I do it once for iOS and then for Android?

  • RAD Redux (Score:4, Interesting)

    by Tablizer ( 95088 ) on Monday May 29, 2017 @06:39PM (#54507597) Journal

    We've had RAD systems for decades. They make the first 80% easy, but not the last 20%. One is always dealing with things like legacy databases with goofy schemas and domain-specific intricacies.

    Tools that may take longer to lay down the basics but can be tuned more easily for specifics still seem the best bet.

    Plus you have the issue that mobile-device UIs need to be "responsive" to different screen sizes. These can take a lot of experimentation to get right because context is involved. They are solving 1990s problems.

  • by dpbsmith ( 263124 ) on Monday May 29, 2017 @06:59PM (#54507701) Homepage

    Was the first version of ResEdit released in 1984 or 1985? In any case, for more than thirty years, there have been developer tools that allowed you to draw a UI screen, while simultaneously creating a WYSIWYG screen image, an object-oriented description of the elements in the image (e.g. "a checkbox at 50,100"), and code to generate the image.

    As nearly as I can tell, the only novelty here is the ability to work off a static image file, rather than being able to work off the time-sequence of the series of drawing manipulations used to draw the file. This wouldn't be a big deal even if it worked, since it doesn't take very long for a human to look at a UI screen and draw a duplicate layout using a UI layout tool.

    As for "77% accuracy," I have no idea what that means or how you calculate the percentage, but sounds like "it doesn't work," because the amount of work needed to correct something that is only 77% accurate is probably about the same--quite possibly more--than the amount of work needed to create it from scratch with a good layout tool.

    Furthermore, it is very common for a UI layout to contain elements that are only conditionally visible. An obvious one would be a tabbed panel. A screenshot can show you the controls that are in the frontmost tab page, but has no information at all that would allow pix2code to even begin to guess at the controls and other elements that are present in the other tab fields. Therefore, to get even a complete visual record of the interface, it is necessary to have some kind of procedure or script that results in every UI element being systematically revealed. That's not trivial. (Imagine some of the currently fashionable designs that save screen real estate by putting larger parts of the UI on invisible trays that only slide into view when needed).
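For what it's worth, the 77 percent figure is presumably measured on generated DSL tokens rather than on pixels; a plausible per-position token accuracy metric looks like this (the exact definition used in the paper is an assumption here):

```python
def token_accuracy(predicted, reference):
    """Fraction of positions where tokens match, normalized by the
    longer sequence so insertions/deletions also count as errors.
    This is an assumed metric, not necessarily the paper's."""
    n = max(len(predicted), len(reference))
    if n == 0:
        return 1.0
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / n

print(token_accuracy(["stack", "{", "btn", "}"],
                     ["stack", "{", "label", "}"]))  # 0.75
```

Note that under a metric like this, a single wrong widget type drags the score down even when the overall structure is right, which cuts both ways when judging "77%".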

  • An accuracy of 77% is considered a passing C grade (unfortunately) in school and is completely unacceptable in a real-world business environment. It needs to be getting an A+ to be acceptable in the real world. Fire the sucker.

  • by Anonymous Coward

    Sigh. Here we go: that isn't 'AI'. Can we please start calling it something more accurate and honest? Can millennials please step outside of their teenaged fantasies and join the rest of us here in reality? It would be so much easier. Double sigh.

  • Computer languages are almost all variations of English (While, If, then, else, go, return are all English words).

    But human-derived languages, particularly alphabet-based ones, are not appropriate for coding.

    Alphabet-based languages are designed to represent an infinite set of words. Computer languages use a small set, often fewer than 100.

    • by dskoll ( 99328 )

      Um. Say what?

      Any set of symbols is an "alphabet" in computer science parlance. Furthermore, although computer languages have a limited number of keywords, they also have variable names, etc.

      Meh. I think parent is satire. Too subtle a WTF to be real.

    • Welcome to Simulink. We've been using it for a while to solve quite a few complex tasks.

    • I don't know what languages you're using, but every one I use allows you to define additional words to do other things. They're called functions, procedures, subroutines, macros, whatever. I don't know any computer language that uses a small set of words. Even assembler allows you to use macros that you can call all sorts of names. And then there are labels for jumps in both low- and high-level languages.

      Since you used english as an example, there are far more valid words in programming languages than in the

      • FWIW BASIC before 1983ish was like that. Subroutines were defined as the line numbers of the first statements in them (ie GOSUB 1000, not GOSUB DrawBox(x, y, 16, 24))

        Not that I'm disagreeing with you in spirit, just technically you're wrong, which is the best kind of wrong, or something ;-) You youngsters have it so good with your procedures and functions and classes and, uh, methods, and properties and... back in my day we had line numbers and variables with one or two letters maybe followed by a dollar

        • Well, except that there were other languages before basic that didn't use line numbers. That moldy golden oldie, assembler, jumped to offsets (usually defined as a symbolic name), not line numbers. But if you want to call me a youngster, at my age I'll take it as a compliment :-p

          BTW - you could use variables longer than 2 letters, just that only the first two were significant, so COUNT and COW were the same. As long as you kept that limitation in mind, it made for more readable code. And if you didn't, wel

          • Depended on the version of BASIC, I know most Microsoft versions did allow you to say "APPLE=2: PRINT APLPE" (sic) and it'd print 2, but there were quite a few that didn't and rigorously enforced the 1 or 2 letter limit.

            OTOH, the BASIC I used on the ZX81 (my first home computer!) did allow you unlimited variable names. And I guess as GO SUB (sic) accepted formulas in Sinclair BASICs (alas not Microsoft's) you could have used named subroutines (ie 10 LET DRAWBOX=1000 / 20 GO SUB DRAWBOX) so in that respec

  • You are familiar with the Historical Documents, are you not?
  • I wrote OCR software years ago just for fun. Writing algorithms that could identify the boundaries of each typeset character on a scanned magazine page, as well as images, was pretty hard to do... but I did it... with 98% accuracy... and no real AI... just algorithms like SPF.

    What this guy did barely counts. It just identifies rectangles (poorly) and guesses what kind of rectangle (poorly) and spits out code.

    It's a cute project for a high school kid.
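The character-boundary detection the grandparent describes can be done with plain connected-component labeling on a binarized image - no neural network required. A minimal flood-fill sketch over a 0/1 grid (real OCR preprocessing adds deskewing, thresholding, noise removal, etc.):

```python
from collections import deque

def components(grid):
    """Return bounding boxes (min_r, min_c, max_r, max_c) of
    4-connected blobs of 1s in a binary grid, in scan order."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and not seen[r][c]:
                # BFS flood fill from this unvisited foreground pixel.
                q = deque([(r, c)])
                seen[r][c] = True
                box = [r, c, r, c]
                while q:
                    y, x = q.popleft()
                    box = [min(box[0], y), min(box[1], x),
                           max(box[2], y), max(box[3], x)]
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                boxes.append(tuple(box))
    return boxes

grid = [[1, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 1]]
print(components(grid))  # [(0, 0, 1, 1), (1, 3, 2, 3)]
```

Each box is a candidate character (or widget); classification of what's inside the box is the separate, harder step.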
  • I'll give this thing a screenshot of my program: a big button that says "Fix my problem," or maybe "Enhance". With that I guess I did the hard part, so now I'll just kick back and wait for the magic AI to generate the code to make it work.
  • A reminder of the day when you didn't need a crack team of engineers to produce a simple educational program or hyperlinked ebook. Every teacher a programmer. Every student a programmer producing value added content, whether 5 or 55 years of age.

    We talk about advancements in the industry but we've taken a giant step backward in terms of creative output. See 'Inigo Gets Out'

  • The Pix2Code model actually outperforms many human coders

    Challenge accepted - see you at the coffee machine - remember your cup!

  • As ANYONE who writes programs with GUIs will tell you, a system like this can AT BEST write 10% of the code needed in an actual "program", because the GUI layer isn't that tough to write, and there is no way on god's green earth that something that looks at screenshots can infer how the other 90% of the code has to work. Add to this that even if it's 99% accurate - how will a human programmer fix the remaining 1% of the bugs?

    So - programmers everywhere are laughing at this; it's patently obvious

  • by sbaker ( 47485 ) on Tuesday May 30, 2017 @08:10AM (#54509743) Homepage

    I see questions on Quora and similar places from kids who are thinking of taking up a career as a computer programmer - one commonly asked question is "If I become a programmer, will AI make my career obsolete?" - and this is a very valid concern. If I were a truck driver, I'd be really worried that self-driving trucks would take my job 5 years from now.

    This announcement (which effectively says to the layperson "Programmers are about to become obsolete") will have a chilling effect on those people who are just thinking about getting into this field.

    In truth - this AI program will never see the light of day - it can NEVER "write a program from screenshots" because the necessary information to do that isn't present in the screenshots - even in principle. What HAPPENS when you push this button? All the screenshots tell you is that there is a button...and MAYBE...if the screenshot is somehow linked to other screenshots...it might tell you that pressing the button takes you to another panel. What it doesn't tell you is that pressing that button caused the camera to take a photo, that the software has to reconstruct a 3D image of a person from that photo, that this has to be sent off to the server to match other 3D images, that the resulting match produces that person's name - which the program is then given from the server - and which then results in that "NAME" field on the next GUI panel being populated with an actual name and not the "John Doe" that the GUI designer put there so the programmer would know that this is where the name goes.

    By itself - this announcement can be laughed at and called bullshit by anyone who has anything to do with writing programs (and I'm 100% sure it's being laughed at right now) - but the CHILLING effect that such ridiculously over-stated claims make on those who might be considering entering the industry is a very, very bad thing.

  • Startup Uses Bullshit To Defraud Investors

  • Looks like a fun project, but ... the three platforms they mention have somewhat different UX guidelines; it's a much 'deeper' problem to translate a GUI into something idiomatically appropriate for each of the mentioned platforms.
    Not to say this isn't a good, interesting start on that very thing.

  • This will be great for automating the process of generating fake landing pages for phishing attacks. Think phishing worms.
