Why Cinematography Keywords Matter

Most AI video prompts fail in the same place. The subject is fine. The action is fine. What's missing is the language a director or DP would use to describe how the camera sees it. That's the gap this guide fills.

AI video models are trained on enormous libraries of film, TV, advertising, and stock footage tagged with the technical vocabulary of the trade. Phrases like "low angle tracking shot" or "shallow depth of field with bokeh" map onto huge clusters of training examples. When you use that vocabulary, you give the model a direct route to a specific visual outcome.

Prompts written in everyday language give the model a thousand directions to wander. Prompts written in cinematography language give it one.

Shot Sizes

Shot size is how much of the subject and scene fits in the frame. It is the single most powerful framing decision you can make in a prompt, and the easiest one to get wrong by leaving it out.

Shot Size	Subject	Use Cases
Extreme Wide Shot (EWS)	Tiny within the environment	Setting scale, isolation, establishing place
Wide Shot (WS)	Full body with environment around it	Action, dance, full body product shots
Medium Shot (MS)	Waist up	Dialogue, presenter-style content
Medium Close-Up (MCU)	Chest up	Interviews, conversational content with hand gestures
Close-Up (CU)	Face filling most of the frame	Emotional beats, reaction shots, thumbnails
Extreme Close-Up (ECU)	A single feature: eyes, lips, or a product detail	Texture, detail, drama

Pro tip: If the model gives you a shot at the wrong distance, naming the shot size explicitly tends to fix it faster than describing the size with adjectives.

Camera Angles

Angle refers to the position of the camera in relation to the subject in your video. It changes the emotional reading of the shot before any other element does.

With practice, you'll get a sense of how these can be mixed and matched in scenes, but here are the basics:

Angle	Camera Position	Emotional Effect
Eye Level	Camera at subject's eye height	Neutral, conversational, equal footing
Low Angle	Camera below subject, pointing up	Powerful, dominant, heroic
High Angle	Camera above subject, pointing down	Small, vulnerable, observed
Dutch Angle	Camera tilted off horizontal	Unease, chaos, disorientation
Bird's Eye View	Camera directly overhead	Pattern, geometry, god-like detachment
Worm's Eye View	Camera at ground level, pointing straight up	Towering, surreal, dramatic
Over-the-Shoulder (OTS)	Camera behind one subject, facing another	Conversation, intimacy, perspective

Check out how changing only the camera angle makes a scene feel completely different, and how the subject seems more or less in control of it.

High angle:

Low/medium angle:

Pro tip: As noted under Shot Sizes, naming the shot size explicitly is the quickest fix when the model frames the subject at the wrong distance. With Influencer Studio's Cinema Mode, you don't have to worry about that because you can select the shot size as one of the settings on the shot.

Camera Movement

From the standpoint of technical capabilities, movement is where AI video has improved most dramatically over the last 18 months. Prompting precisely is the difference between a static scene with a wobble and a shot that feels directed and intentional.

Movement	What it does
Static	Camera locked off. No movement.
Slow push in	Camera physically moves toward the subject. Builds tension or intimacy.
Slow pull out	Camera physically moves away from the subject. Creates scale or separation.
Zoom in	Optical, not physical. Lens tightens. Reads as tension or unease.
Zoom out	Optical, not physical. Lens widens. Reads as reveal or detachment.
Pan left	Camera rotates horizontally left from a fixed point.
Pan right	Camera rotates horizontally right from a fixed point.
Tilt up	Camera rotates vertically upward from a fixed point.
Tilt down	Camera rotates vertically downward from a fixed point.
Orbit left	Camera circles the subject to the left. Good for reveals and product shots.
Orbit right	Camera circles the subject to the right. Good for reveals and product shots.
Handheld	Slight natural imperfection in movement. Documentary feel, intimacy, urgency.

Something as simple as whether the camera is pushing in on the subject or pulling away from them can turn an intense scene into an expositional one.

Adding or removing elements of the surroundings gives you control over what the viewer sees, but also what the viewer feels.

Take a look at this scene where the camera closes in on our subject:

And compare it with this one, where the camera is moving away:

Can you sense how one makes you focus on the moment while the other makes you focus on the context around the moment?

Pro tip: One movement per prompt. Stacking competing instructions like "slow push in with a pan left and handheld movement" pulls the model in several directions at once, so it has to guess which movement to use. Pick the one that serves the shot.
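The "one movement per prompt" rule can be checked mechanically before you spend a generation on it. The sketch below is only an illustration: the keyword list is copied from the movement table, while the grouping into families and the function names `find_movements` and `has_competing_movements` are assumptions made for this example, not part of any tool.

```python
import re

# Movement phrases from the table above. Grouping them into "families"
# is an assumption of this sketch, not an established taxonomy.
MOVEMENT_KEYWORDS = {
    "static": ["static", "locked off"],
    "push/pull": ["push in", "pull out"],
    "zoom": ["zoom in", "zoom out"],
    "pan": ["pan left", "pan right"],
    "tilt": ["tilt up", "tilt down"],
    "orbit": ["orbit left", "orbit right"],
    "handheld": ["handheld"],
}


def find_movements(prompt: str) -> list[str]:
    """Return the movement families named in the prompt, in table order."""
    text = prompt.lower()
    found = []
    for family, phrases in MOVEMENT_KEYWORDS.items():
        # Word boundaries keep phrases from matching inside longer words.
        if any(re.search(rf"\b{re.escape(p)}\b", text) for p in phrases):
            found.append(family)
    return found


def has_competing_movements(prompt: str) -> bool:
    """True when the prompt names more than one movement family."""
    return len(find_movements(prompt)) > 1


stacked = "slow push in with a pan left and handheld movement"
print(find_movements(stacked))           # ['push/pull', 'pan', 'handheld']
print(has_competing_movements(stacked))  # True
```

A simple keyword match like this will miss paraphrases ("the camera drifts closer"), so treat it as a reminder to reread the prompt rather than a guarantee.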

Framing and Composition

Composition is how elements are arranged inside the frame. AI models respond well to a small set of named compositional rules. You can experiment with these and discover how some overlap, but here are some general expectations you can have.

Technique	What it does
Rule of thirds	Subject placed on a third line rather than dead center. Naturally pleasing, leaves room for context.
Centered composition	Subject perfectly centered. Formal, intentional. Wes Anderson or Kubrick references land well here.
Leading lines	Roads, hallways, fences, light beams that draw the eye toward the subject.
Frame within a frame	Subject seen through a doorway, window, or archway. Adds depth and context.
Negative space	Large empty area around the subject. Loneliness, scale, or room for a text overlay in social content.
Foreground/midground/background	Three layers of depth. Naming all three in a prompt almost always produces a more dimensional shot.

Composition is arguably the most "artistic" choice to make about a scene. Many directors make a certain type of composition their trademark, and once you see it, it's impossible to miss.

While most of the other decisions about a scene have relatively predictable outcomes (wide shots lead to an epic feeling and so on), framing is a bit ambiguous. The same sequence, framed differently, can either ground the action or completely disorient the viewer (sometimes both).

If we put the subject dead center in the frame, the viewer understands what the focus of the scene is and what the shot is about.

But if you leave a lot of space around the subject, the scene no longer has an obvious focal point.

Pro tip: When you want the model to leave room for a logo, caption, or product overlay in social ad creatives, say so! Try including something like "negative space on the right for text overlay" in your prompt.

Depth of Field

Depth of field is what separates a shot that looks like a phone clip from a shot that looks like it was taken on a cine lens. AI models handle this concept well when you name it explicitly.

Technique	What it does
Shallow depth of field	Narrow plane of focus. Background blurs into bokeh. Cinematic, intimate, isolates the subject.
Deep depth of field	Foreground and background both sharp. Documentary, landscape, environmental.
Bokeh	The quality of the out-of-focus areas. Specify "creamy bokeh" or "anamorphic bokeh" for character.
Rack focus	Focus shifts from one subject to another within the same shot. A directed reveal.
Macro focus	Extreme close focus on tiny detail. Pollen, water droplets, fabric weave.

Depth of field is less about aesthetics and more about attention. It tells the viewer where to look and, just as importantly, what to ignore. A shallow depth of field collapses the world down to one thing (useful when you want viewers to focus on your product!). A deep depth of field says everything in front of you is relevant.

Lighting

Lighting often goes unnoticed because we're just so used to seeing certain standard lighting setups. Often, you don't even realize how lighting is adding to the emotional weight of a scene.

However, it makes all the difference when you get it wrong. Especially with AI video, lighting is the thing that sets engaging cinematography apart from slop.

Most AI models have a strong default style, often slightly oversaturated and evenly lit. Naming a lighting setup overrides that default and can make your scene lose that "AI look" and feel.

Setup	What it does
Three-point lighting	Key light, fill light, backlight. The studio standard. Clean, professional, talking head ready.
Rembrandt lighting	Key light at 45 degrees creating a small triangle of light on the shadow-side cheek. Classic portrait look.
Golden hour	Warm, low-angle sunlight in the hour after sunrise or before sunset. Soft, flattering, cinematic shorthand for emotion.
Blue hour	The brief window just before sunrise or just after sunset. Cool, moody, twilight atmosphere.
Practical lighting	Light sources visible in the frame: lamps, neon signs, candles, screens. Realism and atmosphere.
Backlight / rim light	Light source behind the subject. Creates a halo and separates the subject from the background.
Silhouette	Subject completely dark against a bright background. Mood, mystery, anonymity.
High key	Bright, low contrast, evenly lit. Beauty, fashion, comedy.
Low key	Dark, high contrast, deep shadows. Drama, thriller, noir.
Chiaroscuro	Strong contrast between dark and light within the same frame. Painterly, Caravaggio-adjacent.
Hard light	Produces sharp-edged shadows. Directional, dramatic, unforgiving.
Soft light	Wraps around the subject. Flattering, natural, commercial.

Of all the cinematography decisions you can make, lighting has the most immediate impact on mood. Keep the same subject, the same framing, and the same movement, light it differently, and you have a completely different scene.

Check out this shot in standard three-point lighting:

Compare that with a shot with harsh overhead lighting:

The difference isn't subtle. Three-point lighting says: this person is in control, this is a professional environment, trust what they're telling you. Harsh overhead lighting says something is wrong. Same person, same room, completely different story.

Pro tip: When a generation comes out flat, the issue is usually lighting direction, not lighting amount. Specify where the key light is coming from. "Key light from camera left at 45 degrees" gives the model a clear instruction.

Putting It Together: The Cinematography Stack

Every section of this guide covers one decision.

When you combine them in a single prompt, the results improve dramatically. Think of it as a checklist you run through before you commit to a generation.

Start with shot size: how far is the camera from the subject? Then decide on an angle. Then movement. Then think about where the subject sits in the frame, what the lighting is doing, what is in focus, and finally, whether a film reference would push the look in a specific direction.

A prompt that answers all seven of those questions will outperform one that answers two or three, almost every time. Not because the model needs the information but because the specificity removes the guesswork, and guesswork is where AI video goes wrong.
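One way to run the checklist is to treat each of the seven questions as a field and see which ones a prompt still leaves blank. The following is a minimal sketch under stated assumptions: `ShotSpec`, `missing_decisions`, and `build_prompt` are hypothetical names invented for this example, and putting the subject first and joining the parts with commas is just one reasonable ordering, not a rule any particular model requires.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ShotSpec:
    """One optional field per checklist question (hypothetical structure)."""
    subject: str
    shot_size: Optional[str] = None
    angle: Optional[str] = None
    movement: Optional[str] = None
    composition: Optional[str] = None
    lighting: Optional[str] = None
    focus: Optional[str] = None
    film_reference: Optional[str] = None


def missing_decisions(spec: ShotSpec) -> list[str]:
    """Name the checklist questions the spec has not answered yet."""
    return [
        f.name
        for f in fields(spec)
        if f.name != "subject" and getattr(spec, f.name) is None
    ]


def build_prompt(spec: ShotSpec) -> str:
    """Join the subject and every answered decision, in field order."""
    parts = [getattr(spec, f.name) for f in fields(spec)]
    return ", ".join(p for p in parts if p)


spec = ShotSpec(
    subject="a barista pouring latte art",
    shot_size="medium close-up",
    angle="eye level",
    movement="slow push in",
    composition="rule of thirds",
    lighting="key light from camera left at 45 degrees",
    focus="shallow depth of field",
)
print(build_prompt(spec))
print(missing_decisions(spec))  # ['film_reference']
```

Here the only open question is the film reference, so you can decide deliberately whether to add one instead of discovering the gap after a weak generation.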