One option is to put mics left and right, far enough away to not be in the video shot. (As video recordings go wider and wider nowadays, from 4:3 to 16:9 to 21:9, that becomes more and more difficult.)
The other alternative is playback: record the audio first, and then play it back in sync while recording the images. This is actually done a lot, and sometimes you can tell in (professional) videos that the acoustics of the recording do not match the scene where the video is shot (the audio acoustics are then too good for the video location). Sometimes you can even tell that the video is "played" using a different accordion than the one used for the audio.
I hope that you know that I already knew all of that.
My solution (which works REALLY well) is green screens. I can shoot "the talent" in any aspect ratio I want (I shoot in 4:3 or 3:2 at the highest resolutions the equipment offers) and place that in any frame that I want, usually 16:9. I am not into the cinematic or anamorphic formats, which work best in situations where you are trying to show off the surrounding scenery. 4:3 is the old TV standard, and that was actually REALLY good for framing an accordionist. With green screens it became easy to "carve out" the mics, as long as the bellows didn't slip behind them during a big bellows pull.
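For anyone curious about the arithmetic of dropping a 4:3 or 3:2 shot into a 16:9 frame, here is a rough sketch in Python. The function name `fit_inside` and the pixel sizes are made up for illustration, and it assumes you want the whole shot visible, centered, with no cropping or stretching. It is not what any particular editing program does internally.

```python
from fractions import Fraction


def fit_inside(src_w, src_h, frame_w, frame_h):
    """Scale a source shot to fit inside a frame, keeping its aspect ratio.

    Returns (scaled_width, scaled_height, x_offset, y_offset), where the
    offsets center the scaled shot in the frame. Hypothetical helper,
    for illustration only.
    """
    # Use the smaller of the two scale factors so nothing gets cropped.
    scale = min(Fraction(frame_w, src_w), Fraction(frame_h, src_h))
    out_w = int(src_w * scale)
    out_h = int(src_h * scale)
    # Whatever is left over is split evenly on both sides.
    x_offset = (frame_w - out_w) // 2
    y_offset = (frame_h - out_h) // 2
    return out_w, out_h, x_offset, y_offset


# Example sizes (assumed): a 4:3 shot and a 3:2 shot into a 1920x1080 frame.
print(fit_inside(4000, 3000, 1920, 1080))
print(fit_inside(6000, 4000, 1920, 1080))
```

With these example numbers the 4:3 shot ends up 1440 pixels wide in a 1920-wide frame, so there are 240 pixels of background on each side to fill with whatever scene you like.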
The thing is, I am a stickler for quality. I absolutely hate shitty quality green screen work; even the flickering of a single hair on my head bothers me, and I work and test so that in future videos it's not there. Still, I have to live within my personal and financial limitations.
Sound quality is also important for me... actually that is priority #1 (which is a bit of a blessing, because doing *good* video is a hundred times harder, but thankfully bad videos with good audio are easier to watch and more popular than good videos with bad audio). I have never played to a video in sync (backing track aside!). I kind of feel it is cheating, but that said, I know that as my desire to do better videos grows, at some point it is going to happen, and it will be a lot of fun for me.
When I first started looking at videos of accordionists on Vimeo and YouTube, they were all static shots of someone sitting and playing, 99% of the time using the mics from their cellphone or camera. I moved fast to recording the audio separately and instantly my videos were better (to me! Everyone has the right to their own opinion, right?). Once that was done, I worked on that static look: I introduced multiple cameras, green screen and motion into the videos, and I enjoyed that a lot. The "look" now became nearly as important as the music of the video. My pleasure level looking at the results grew exponentially.
My big limitation is that I am ONE person doing it all alone (and that is something that I do not want to change). I am the performer, the musician, sound engineer, lighting engineer, set designer and videographer running multiple cameras and multiple recording devices. The professionals have sound studios with many people working on every minute detail, and many more people covering every other task that comes up.
They can move to different locations, shoot while syncing to pre-recorded music and move on... try doing that alone! Sure it is possible, and I will do it one day just to say I did it, but WOW, compared to sitting in the basement, turning on the cellphone and playing a tune, the work involved is astronomically more complex and difficult.
Looking back now, I think that I am up to something like 8 years of studying and testing (probably well over a thousand mini video tests and even more in hours invested) to see if I like an effect or if it looks the way I want it to, and I have many more hours coming simply because I enjoy it. I also like making educational or "how I did it" videos; it's part of what I like to do.
All that said, we're living in a wonderful time, technologically speaking, where a single person sitting on their bed in front of a small $100 laptop, with a $50 audio interface and a fair quality mic or 2, can put out audio files so good that most people cannot tell you whether they were made in the finest studios around today or not!