Creating the code for this app was a journey, particularly diving into the complexities of camera views and image classification. Despite encountering obstacles along the way, the support of friends and collaborative brainstorming sessions helped me find my way. Each challenge became an opportunity for learning and improvement as I navigated through trial and error to better understand these technologies. In the end, the hard work paid off, and I'm proud of what I achieved. It was a journey filled with ups and downs, but the sense of accomplishment made it all worthwhile.
Here comes the most challenging part of this app's development, and I'll tell you why.
Creating a model with CoreML for this project was quite challenging. Initially, I aimed to classify poses into two categories: "wrong pose," which included data where the hands and legs were too far apart, and "right pose," which included data where the hands and legs were close together, specifically hand-to-hand and leg-to-leg.
However, I encountered an issue where sitting was incorrectly classified as a "right pose." To address this, I added a third class, "others," which included photos of people sitting and standing. Unfortunately, this led to another problem: the model classified the correct pose as "others" with 100% confidence, causing considerable frustration.
When we are lost, we need to ask others
After further investigation and discussion with friends, I realized the data needed a more generalized pattern rather than detailed specifics. Consequently, I redefined the classes: "right pose" specifically contained the Tadasana yoga pose, while "wrong pose" included a mix of other yoga poses.
Finally, this revised model worked as intended, accurately predicting the poses as planned.
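For context, Create ML derives the class labels of an image classifier from the folder names of the training data, so the redefined dataset would have been organized roughly like this (folder names are illustrative, not copied from the project):

TrainingData/
    right pose/    // images of the Tadasana pose only
    wrong pose/    // a mix of other yoga poses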
During this process, I encountered several challenges when trying to position the camera view and the other UI elements in a 60:40 ratio, with the UI elements on the left. The primary issue was that the camera view wouldn't properly fill the right side of the page. Since the app is designed to be used in landscape mode, the camera view also displayed in landscape format. However, I needed the camera view to be in portrait orientation.
I experimented with many different solutions, but none of them worked. The camera view either remained in landscape mode or could be forced into portrait mode but displayed a rotated view, as if the camera were being used in a portrait app.
It's challenging and sometimes seems impossible, but we must always find a way to improvise
Finally, after further investigation, I discovered that the camera view could be set to a custom size, at the cost of it being zoomed in. Although this zooming effect was not ideal, it allowed me to fit the camera view into the desired 60:40 layout ratio. By adjusting the camera view's size and accepting the zoomed-in display, I was able to achieve a functional and visually appealing layout. This approach ensured that the camera view displayed correctly alongside the other UI elements, providing a better user experience.
Here is the code I used to make the camera view's size customizable, with the trade-off that the preview ends up zoomed in.
func setupCamera() {
    captureSession = AVCaptureSession()

    // Use the front wide-angle camera as the video input
    guard let captureDevice = AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: .front),
          let input = try? AVCaptureDeviceInput(device: captureDevice) else { return }
    captureSession.addInput(input)

    // Deliver frames to the delegate so each one can be classified
    let videoOutput = AVCaptureVideoDataOutput()
    videoOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "videoQueue"))
    captureSession.addOutput(videoOutput)

    // A custom frame plus .resizeAspectFill crops the feed, producing the zoomed-in preview
    let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
    previewLayer.videoGravity = .resizeAspectFill
    previewLayer.connection?.videoOrientation = .landscapeRight
    previewLayer.frame = previewFrame
    view.layer.addSublayer(previewLayer)

    captureSession.startRunning()
}
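The snippet above references previewFrame without defining it. As a minimal sketch, assuming the camera occupies the right 60% of the screen in landscape while the left 40% holds the other UI elements, it could be computed like this (the property name and split are my assumptions, not taken from the project):

// Hypothetical frame for the preview layer: the right 60% of the view.
// Because videoGravity is .resizeAspectFill, this narrow frame crops the
// wide landscape feed, which is what produces the zoomed-in effect.
var previewFrame: CGRect {
    let bounds = view.bounds
    let cameraWidth = bounds.width * 0.6
    return CGRect(x: bounds.width - cameraWidth, y: 0,
                  width: cameraWidth, height: bounds.height)
}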
To integrate the ML model with the view, I set the confidence threshold to 0.8. This means the poseStatus variable only updates, and is displayed on the left side of the view, if the model's confidence is at least 0.8. By enforcing this threshold, poseStatus doesn't change on low-confidence predictions, resulting in more stable and accurate feedback for the user.
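As a rough sketch of how that threshold could be wired into the video delegate, assuming PoseClassifier is the name of the Create ML-generated model class and poseStatus is the string shown on screen (both assumed names, not taken from the actual project):

import AVFoundation
import CoreML
import Vision

// Inside the class that owns captureSession and conforms to
// AVCaptureVideoDataOutputSampleBufferDelegate.
lazy var poseRequest: VNCoreMLRequest? = {
    guard let mlModel = try? PoseClassifier(configuration: MLModelConfiguration()).model,
          let visionModel = try? VNCoreMLModel(for: mlModel) else { return nil }
    return VNCoreMLRequest(model: visionModel) { [weak self] request, _ in
        guard let top = (request.results as? [VNClassificationObservation])?.first,
              top.confidence >= 0.8 else { return }  // ignore low-confidence frames
        DispatchQueue.main.async {
            self?.poseStatus = top.identifier        // e.g. "right pose" / "wrong pose"
        }
    }
}()

func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    guard let request = poseRequest,
          let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    try? VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:]).perform([request])
}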
OnePose Yoga is now a fully functional app, though it does have a minor bug. If you're interested in exploring the code behind the development of this app, feel free to check out the GitHub repository!