[2.1] metal compute_hq shader



cloud
Gremlin
Posts: 196
Joined: Tue Aug 08, 2006 6:45 pm
x 14

[2.1] metal compute_hq shader

Post by cloud »

Sadly, it turns out MipmapsGaussianBlur_cs.metal won't compile on Mac.

Compiler error:

Code: Select all

/System/Library/PrivateFrameworks/GPUCompiler.framework/Versions/A/lib/clang/3.5/include/metal/metal_texture:866:21: note: candidate disabled: 'lod' argument must be known at compile-time
As discussed here:

https://forums.developer.apple.com/thread/75586

So code like

Code: Select all

			outputImage.write( float4( outColour[ @iPixel ], 1.0 ), uint2( i2Center +  @iPixel * i2Inc ),
							   p.dstLodIdx );@end
will compile iff p.dstLodIdx is a fixed constant rather than a shader param. Any suggestions for how I should go about fixing this in Ogre?

In the link above the first suggestion looks the easiest, if ugly: something like a vector of materials that grows whenever there's demand for another mipmap level. This would go in CompositorPassMipmap, I suppose.
dark_sylinc
OGRE Team Member
Posts: 5299
Joined: Sat Jul 21, 2007 4:55 pm
Location: Buenos Aires, Argentina
x 1280

Re: [2.1] metal compute_hq shader

Post by dark_sylinc »

Yes, this is a known issue.

The solution would be to do the same thing we do in D3D11, but we haven't found the time to do it yet. I haven't gotten around to it, and the problem is too specific for berserkerviking (currently the new Metal on macOS maintainer), who is still getting used to Ogre.

It shouldn't be too hard to fix, though it's not trivial either.

Re: [2.1] metal compute_hq shader

Post by cloud »

so I added

Code: Select all

                    IdString mipLevel("mip_level");
                    blurH2->setProperty(mipLevel, mip );
                    blurV2->setProperty(mipLevel, mip );
into CompositorPassMipmap::setupComputeShaders

and changed @piece( image_store ) of MipmapsGaussianBlur_cs.metal to be

Code: Select all

    @property( downscale_lq )
		@foreach( 2, iPixel )
			outputImage.write( float4( outColour[ @iPixel ], 1.0 ),
                              uint2( i2Center +  @iPixel * i2Inc ),
                              level( @value( mip_level ) ) );@end
	@end @property( !downscale_lq )
		@foreach( 2, iPixel )
			outputImage.write( float4( (outColour[ @iPixel * 2 ] + outColour[ @iPixel * 2 + 1 ]) * 0.5, 1.0 ),
							   uint2( i2Center +  @iPixel * i2Inc ),
							   level( @value( mip_level ) )  );@end

I didn't realise it before, but each mipmap is created by a cloned compute job.

But now I get a new, annoying set of errors which are much worse:

Code: Select all

Metal SL Compiler Error in 2GaussianBlurBase_cs:
Compilation failed: 

<program source>:265:16: error: no matching member function for call to 'write'
                        outputImage.write( float4( outColour[ 0 ], 1.0 ),
   ~~~~~~~~~~~~^~~~~
/System/Library/PrivateFrameworks/GPUCompiler.framework/Versions/A/lib/clang/3.5/include/metal/metal_texture:866:21: note: candidate function not viable: no known conversion from 'uint2' (aka 'vector_uint2') to 'ushort2' (aka 'vector_ushort2') for 2nd argument
    METAL_FUNC void write(vec<T,4> color, ushort2 coord, ushort lod = 0) METAL_VALID_LOD_ARG(lod) {
                    ^
/System/Library/PrivateFrameworks/GPUCompiler.framework/Versions/A/lib/clang/3.5/include/metal/metal_texture:857:21: note: candidate disabled: 'lod' argument value must be 0
    METAL_FUNC void write(vec<T,4> color, uint2 coord, uint lod = 0) METAL_VALID_LOD_ARG(lod) {
                    ^                                                ~~~~~~~~~~~~~~~~~~~~~~~~
<program source>:268:16: error: no matching member function for call to 'write'
                        outputImage.write( float4( outColour[ 1 ], 1.0 ),
"disabled: 'lod' argument value must be 0"

So when I said "will compile iff p.dstLodIdx is a fixed constant not a shader param", I guess I had only tried 0 and assumed the rest.

Seems I've been well and truly flummoxed by Apple.

Re: [2.1] metal compute_hq shader

Post by dark_sylinc »

Yeah, the only solution is to create a texture "view".

As a workaround, you can comment out "compute_hq" (i.e. turn it into "//compute_hq"), which falls back to a faster, lower quality filter.
berserkerviking
OGRE Retired Team Member
Posts: 63
Joined: Tue May 02, 2017 8:15 pm
x 16

Re: [2.1] metal compute_hq shader

Post by berserkerviking »

We encountered this same problem earlier this summer when the gaussian blur shader for ScreenSpaceReflections wouldn't compile.
The problem is that macOS Metal (1.2) only supports the lod=0 variant of texture.write. Variable lod is supported only on Metal iOS.
I'm not sure why this is the case, but I suspect it is because lod=0 is the only variant that all Mac GPUs (Intel, AMD, Nvidia) support.

Now that I have integrated the Sample_TutorialCompute01_UavTexture fix from cloud I will begin to look into this problem.

Re: [2.1] metal compute_hq shader

Post by dark_sylinc »

berserkerviking wrote:We encountered this same problem earlier this summer when the gaussian blur shader for ScreenSpaceReflections wouldn't compile.
The problem is that macOS Metal (1.2) only supports the lod=0 variant of texture.write. Variable lod is supported only on Metal iOS.
I'm not sure why this is the case, but I suspect it is because lod=0 is the only feature that all mac gpus (Intel, AMD, Nvidia) all support.

Now that I have integrated the Sample_TutorialCompute01_UavTexture fix from cloud I will begin to look into this problem.
Since it looks like you'll be tackling this much sooner than I'll be able to:

The solution is to do the same thing we do with D3D11. As you guessed, I found out that not all desktop GPUs support writing to LOD levels other than 0.

For Compute Shaders, in D3D11 we create an UnorderedAccessView of the given slice and mipmap:

Code: Select all

void D3D11RenderSystem::_bindTextureUavCS( uint32 slot, Texture *texture,
                                           ResourceAccess::ResourceAccess access,
                                           int32 mipmapLevel, int32 textureArrayIndex,
                                           PixelFormat pixelFormat )
{
    if( texture )
    {
        D3D11Texture *dt = static_cast<D3D11Texture*>( texture );
        ID3D11UnorderedAccessView *uavView = dt->getUavView( mipmapLevel, textureArrayIndex, pixelFormat );
        mDevice.GetImmediateContext()->CSSetUnorderedAccessViews( slot, 1, &uavView, NULL );

        mMaxBoundUavCS = std::max( mMaxBoundUavCS, slot );
The key part is:

Code: Select all

dt->getUavView( mipmapLevel, textureArrayIndex, pixelFormat );
Where the texture keeps a cache of ID3D11UnorderedAccessView pointers.

From what I can see, Metal's closest equivalent is "texture views". Metal should just mimic what D3D11 is doing: when a mip level is bound, a texture view of that level gets created (or fetched from the cache) and bound in its place, so the selected mip level becomes mip 0 of the view.

This will fix the problem (well... the compute shader needs to be modified so that the lod is hardcoded to 0).
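
The caching idea described above can be sketched in plain C++ as follows. This is only an illustration and not Ogre code: TextureView and CachedTexture are hypothetical stand-ins, and the spot where a real backend would create the actual view (an ID3D11UnorderedAccessView in D3D11, a texture view in Metal) is marked with a comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical stand-in for a backend view object.
struct TextureView
{
    int32_t baseMip;     // mip of the parent texture exposed as level 0 of the view
    int32_t baseSlice;   // array slice of the parent texture exposed as slice 0
    int     pixelFormat; // format the view interprets the data as
};

// Hypothetical texture that owns a cache of views, one per (mip, slice, format).
class CachedTexture
{
public:
    // Returns the cached view for the key, creating it on first use.
    const TextureView* getUavView( int32_t mip, int32_t slice, int pixelFormat )
    {
        assert( mip >= 0 && slice >= 0 );
        const Key key( mip, slice, pixelFormat );
        std::map<Key, TextureView>::iterator it = mViews.find( key );
        if( it == mViews.end() )
        {
            // A real backend would call the API's view-creation function here.
            it = mViews.emplace( key, TextureView{ mip, slice, pixelFormat } ).first;
            ++mNumViewsCreated;
        }
        return &it->second; // std::map keeps element addresses stable
    }

    std::size_t getNumViewsCreated() const { return mNumViewsCreated; }

private:
    typedef std::tuple<int32_t, int32_t, int> Key;
    std::map<Key, TextureView> mViews;
    std::size_t mNumViewsCreated = 0;
};
```

Binding the same mip twice returns the same view, so the creation cost is only paid the first time a given level is used.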

Later on, for rendering we haven't implemented it yet because it was macOS only, as iOS doesn't support this; but the same is done in D3D11RenderSystem::queueBindUAV where we also create an UnorderedAccessView of the given slice and mipmap but without caching it in Texture:

Code: Select all

switch( texture->getTextureType() )
{
case TEX_TYPE_1D:
    descUAV.ViewDimension = D3D11_UAV_DIMENSION_TEXTURE1D;
    descUAV.Texture1D.MipSlice = static_cast<UINT>( mipmapLevel );
    break;
case TEX_TYPE_2D:
    descUAV.ViewDimension = D3D11_UAV_DIMENSION_TEXTURE2D;
    descUAV.Texture2D.MipSlice = static_cast<UINT>( mipmapLevel );
    break;
case TEX_TYPE_2D_ARRAY:
    descUAV.ViewDimension = D3D11_UAV_DIMENSION_TEXTURE2DARRAY;
    descUAV.Texture2DArray.MipSlice         = static_cast<UINT>( mipmapLevel );
    descUAV.Texture2DArray.FirstArraySlice  = textureArrayIndex;
    descUAV.Texture2DArray.ArraySize        = static_cast<UINT>( texture->getDepth() -
                                                                 textureArrayIndex );
    break;
case TEX_TYPE_3D:
    descUAV.ViewDimension = D3D11_UAV_DIMENSION_TEXTURE3D;
    descUAV.Texture3D.MipSlice      = static_cast<UINT>( mipmapLevel );
    descUAV.Texture3D.FirstWSlice   = 0;
    descUAV.Texture3D.WSize         = static_cast<UINT>(texture->getDepth());
    break;
default:
    break;
}

D3D11Texture *dt = static_cast<D3D11Texture*>( texture.get() );

HRESULT hr = mDevice->CreateUnorderedAccessView( dt->getTextureResource(), &descUAV,
                                                 &mUavs[slot] );
Why is that? I don't remember. It probably has to do with the fact that Compute Shaders were added a little bit later after UAVs on rendering pipelines.

Cheers
Matias

Re: [2.1] metal compute_hq shader

Post by berserkerviking »

Cloud: what is the specific program that uses the blur shader that you are trying to get to work?

Re: [2.1] metal compute_hq shader

Post by berserkerviking »

Mat: could you please clarify what you meant when you said: "Later on, for rendering we haven't implemented it yet because it was macOS only, as iOS doesn't support this"
What is the "this" that iOS doesn't support?

Re: [2.1] metal compute_hq shader

Post by berserkerviking »

Matias: Is it okay that I called you "Mat?" I often abbreviate people's names in my notes. But I try not to abbreviate
when talking to the person; some people are sensitive to that.

Re: [2.1] metal compute_hq shader

Post by berserkerviking »

I agree with Matias that texture views should be the first thing we try. While I don't know for a fact that this will work, the documentation
certainly indicates that it will solve the problem. If it turns out that it doesn't work, then the function constant lod approach would
be the thing to try. But this is much less desirable because changing the lod parameter will initiate a new back-end compile of the
shader. The problem is that this will happen during the frame rendering and will adversely affect the frame rate. (Although it
wouldn't be as horrible as a full compile, which also includes a front-end compile). But it would still be pretty bad.
(BTW: in case you aren't aware, the front-end compiler compiles the shader into an intermediate format and the back-end compiler compiles
this intermediate format into the GPU-specific instructions. I seem to recall that the front-end compile is slower than the back-end compile).
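
To illustrate the cost concern, here is a minimal C++ sketch of what a per-lod specialisation cache would look like, assuming the function constant approach worked at all. Pipeline and SpecialisedPipelineCache are hypothetical names, and the counter merely stands in for the back-end compile.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical handle for a compute pipeline specialised for one lod value.
struct Pipeline
{
    uint32_t lod;
};

class SpecialisedPipelineCache
{
public:
    // Returns the pipeline for 'lod', "compiling" it the first time it is requested.
    const Pipeline& get( uint32_t lod )
    {
        std::map<uint32_t, Pipeline>::iterator it = mPipelines.find( lod );
        if( it == mPipelines.end() )
        {
            ++mNumBackEndCompiles; // stand-in for the expensive back-end compile
            it = mPipelines.emplace( lod, Pipeline{ lod } ).first;
        }
        return it->second;
    }

    std::size_t getNumBackEndCompiles() const { return mNumBackEndCompiles; }

private:
    std::map<uint32_t, Pipeline> mPipelines;
    std::size_t mNumBackEndCompiles = 0;
};

// Simulates generating mips 1..numMips on each of numFrames frames and
// returns how many back-end compiles were needed in total.
inline std::size_t simulateMipGeneration( uint32_t numMips, uint32_t numFrames )
{
    assert( numMips > 0 );
    SpecialisedPipelineCache cache;
    for( uint32_t frame = 0; frame < numFrames; ++frame )
        for( uint32_t lod = 1; lod <= numMips; ++lod )
            cache.get( lod );
    return cache.getNumBackEndCompiles();
}
```

With a cache the compiles happen once per distinct lod rather than once per frame, but the first frame that touches a new lod still pays for them while rendering.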

Re: [2.1] metal compute_hq shader

Post by dark_sylinc »

berserkerviking wrote:But this is much less desirable because changing the lod parameter will initiate a new back-end compile of the
shader.
Somewhere in the Metal docs (I think it was in Metal-Shading-Language-Specification-2.0.pdf) it says that lod must not only be constant, but also 0 (and if I read the post correctly, the OP has already tried this and failed).

Re: [2.1] metal compute_hq shader

Post by berserkerviking »

Matias: do you know what sample program cloud is trying to get to work?

Re: [2.1] metal compute_hq shader

Post by dark_sylinc »

I have no idea what sample. I know SSR & PlanarReflections depend on this.

If you need a simpler script to debug (SSR is quite complex) you can modify ShadowMapDebugging like this to trigger the issue:

In Samples/Media/2.0/scripts/Compositors/ShadowMapDebugging.compositor replace ShadowMapDebuggingRenderingNode with this one:

Code: Select all

compositor_node ShadowMapDebuggingRenderingNode
{
	in 0 rt_renderwindow
	
	texture dummy 1024 1024 PF_R8G8B8A8 mipmap -1 no_fsaa no_gamma uav depth_pool 0
	
	target dummy
	{
		pass clear
		{
			colour_value 1 1 1 1
		}

		pass generate_mipmaps
		{
			mipmap_method compute_hq
		}
	}

	target rt_renderwindow
	{
		pass render_scene
		{
			load
			{
				all				clear
				clear_colour	0.2 0.4 0.6 1
			}
			store
			{
				colour	store_or_resolve
				depth	dont_care
				stencil	dont_care
			}
			overlays	on
			shadows		ShadowMapDebuggingShadowNode
		}
	}
}
The first thing it does is clear "dummy" and then generate mipmaps on it, which triggers the problem. This should be much easier for you to analyze than SSR.
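
For a sense of how many levels are involved, here is a small C++ sketch that lists the sizes of a full mip chain. It assumes that "mipmap -1" in the script above requests mips all the way down to 1x1; fullMipChain is a hypothetical helper, not an Ogre function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Returns the (width, height) of every mip level, starting with level 0,
// halving each dimension (rounded down, clamped to 1) until 1x1 is reached.
std::vector<std::pair<uint32_t, uint32_t>> fullMipChain( uint32_t width, uint32_t height )
{
    assert( width > 0 && height > 0 );
    std::vector<std::pair<uint32_t, uint32_t>> levels;
    levels.emplace_back( width, height );
    while( width > 1u || height > 1u )
    {
        width  = std::max<uint32_t>( 1u, width >> 1 );
        height = std::max<uint32_t>( 1u, height >> 1 );
        levels.emplace_back( width, height );
    }
    return levels;
}
```

Under that assumption the 1024x1024 "dummy" texture has 11 levels, so there are 10 destination levels with a non-zero lod for the compute pass to write to.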

Edit: If you meant the shaders, the relevant shaders and assets are in:
  • Samples/Media/2.0/scripts/materials/Common/Mipmaps.material.json
  • Samples/Media/2.0/scripts/materials/Common/GaussianBlurBase_cs.metal
  • Samples/Media/2.0/scripts/materials/Common/MipmapsGaussianBlur_cs.metal
And the C++ code that handles this is isolated to:
OgreMain/src/Compositor/Pass/PassMipmap/OgreCompositorPassMipmap.cpp
The C++ treats Metal in a special way (different from D3D11 & GL) whenever it does:

Code: Select all

blurV2->getShaderParams( "metal" );
(note blurV2 is not the only variable doing this) where it pushes the LOD to a constant buffer.
Basically all code related to that would go away (including paramDstLodIdx from what I can see). What is related to blurV2->getShaderParams( "default" ); is common to all three APIs and should stay.

Re: [2.1] metal compute_hq shader

Post by berserkerviking »

Hi Matias,

Here is an overview of the changes I would suggest to implement the fix you suggest. Please review.

1. MipmapsGaussianBlur_cs.metal
Ignore p.dstLodIdx
Use 0 instead (2 locations)

2. Add a new member to MetalTexture
// Used for writing to non-zero mip levels of a texture when it is used as a UAV
vector<MetalTexture*> mMipmapViews

3. Modify MetalRenderSystem::queueBindUAV

Code: Select all

   if( !mUavs[slot].buffer && mUavs[slot].texture.isNull() && texture.isNull() )
	return;

    >>> Insert added code here

    mUavs[slot].texture = texture;
Added code (pseudocode):

Code: Select all

    if mipmapLevel > 0
        texView = mMipmapViews[mipmapLevel]
        if texView doesn't exist
            // Create. Note: newTextureViewWithPixelFormat:... is a method of
            // the source MTLTexture, not of the MTLDevice.
            texView = create MetalTexture where base texture is created with:
                [<MTLTexture of texture> newTextureViewWithPixelFormat:pixelFormat
                           textureType:<same as texture>
                           levels:<range mipmapLevel to mipmapLevel>
                           slices:<range textureArrayIndex to textureArrayIndex>]
            add texView to proper mip level slot of mMipmapViews
        texture = texView

Re: [2.1] metal compute_hq shader

Post by dark_sylinc »

berserkerviking wrote:1. MipmapsGaussianBlur_cs.metal
Ignore p.dstLodIdx
Use 0 instead (2 locations)
Yup.

berserkerviking wrote:2. Add a new member to MetalTexture
// Used for writing to non-zero mip levels of a texture when it is used as a UAV
vector<MetalTexture*> mMipmapViews
Yup, though I'd use mUavViews or just mViews or mTexturesViews, etc., because I strongly suspect they'll be needed for more than just mipmaps in the future.

berserkerviking wrote:3. Modify MetalRenderSystem::queueBindUAV

Code: Select all

   if( !mUavs[slot].buffer && mUavs[slot].texture.isNull() && texture.isNull() )
	return;

    >>> Insert added code here

    mUavs[slot].texture = texture;
Added code (pseudocode):

Code: Select all

    if mipmapLevel > 0
        texview = mMipmapViews[mipLevel]
        if texview doesn't exist
            // Create
            texview = create MetalTexture where base texture is created with:
                [mtldevice newTextureViewWithPixelFormat:pixelFormat
                           textureType:<same as texture>
                           levels:<range mipLevel to mipLevel>
                             slices:<range textureArrayIndex to textureArrayIndex>]
    	    add texView	to proper miplevel slot of mMipmapViews
        texture = texview
Sounds fine. Remember that _bindTextureUavCS also needs to be modified: _bindTextureUavCS is for Compute Shaders, queueBindUAV is for the graphics pipeline.
Push your changes to another branch for review (you can push to v2-1-metal-macos if you want; you may want to first merge v2.1->v2-1-metal-macos to keep it up to date).

Re: [2.1] metal compute_hq shader

Post by cloud »

A bit late, I know, and you probably already have test code, but in case it helps with debugging:

My code, based on the sample code and changed a bit.

Compositor: HgSAO.compositor

Code: Select all

compositor_node GenMipmapsTest_Node
{
	in 0 rt_renderwindow
	
	texture rt0 target_width target_height PF_R8G8B8 mipmaps 5 no_gamma no_fsaa uav

	target rt0
	{
		pass clear
		{
			colour_value 0.2 0.4 0.6 1
		}

		pass render_scene
		{
			overlays			off
		}
		
		pass generate_mipmaps
		{
			// compute_hq won't work on Mac use MetalTexture::_autogenerateMipmaps instead
			mipmap_method compute_hq
		}
	}


	target rt_renderwindow
	{
		pass render_quad
		{
			material HgSAO
	    	input 0 rt0
	    	
			quad_normals	camera_far_corners_view_space
		}
		
		pass render_scene
		{
			lod_update_list	off

			//Render Overlays
			overlays	on
			rq_first	254
			rq_last		255
		}
	}
}

workspace HgSAO_Workspace
{
	connect_output GenMipmapsTest_Node 0
}
material HgSAO.material

Code: Select all


// GLSL shaders
vertex_program HgSAO_vs_GLSL glsl
{
	source HgSAO_vs.glsl
}

fragment_program HgSAO_ps_GLSL glsl
{
	source HgSAO_ps.glsl
	default_params
	{
		param_named depthTexture int 0
	}
}

// HLSL shaders
vertex_program HgSAO_vs_HLSL hlsl
{
    source HgSAO_vs.hlsl
    entry_point main
    target vs_5_0 vs_4_0 vs_4_0_level_9_1 vs_4_0_level_9_3
}

fragment_program HgSAO_ps_HLSL hlsl
{
	source HgSAO_ps.hlsl
	entry_point main
	target ps_5_0 ps_4_0 ps_4_0_level_9_1 ps_4_0_level_9_3
}

// Metal shaders
vertex_program HgSAO_vs_Metal metal
{
	source HgSAO_vs.metal
}

fragment_program HgSAO_ps_Metal metal
{
	source HgSAO_ps.metal
	shader_reflection_pair_hint HgSAO_vs_Metal
}

// Unified definitions
vertex_program HgSAO_vs unified
{
	delegate HgSAO_vs_HLSL
	delegate HgSAO_vs_GLSL
	delegate HgSAO_vs_Metal
	
	default_params
    {
        param_named_auto worldViewProj worldviewproj_matrix
    }
}

fragment_program HgSAO_ps unified
{
	delegate HgSAO_ps_HLSL
	delegate HgSAO_ps_GLSL
	delegate HgSAO_ps_Metal
}

// Material definition
material HgSAO
{
	technique
	{
		pass
		{
			depth_check off
			depth_write off

			cull_hardware none

			vertex_program_ref HgSAO_vs
			{
			}

			fragment_program_ref HgSAO_ps
			{
			}

			texture_unit
			{
				filtering			none none point
				tex_address_mode	clamp
				texture 			BeachStones.jpg
			}
		}
	}
}
vertex shader HgSAO_vs.metal

Code: Select all

#include <metal_stdlib>
using namespace metal;

struct VS_INPUT
{
	float4 position	[[attribute(VES_POSITION)]];
	float3 normal	[[attribute(VES_NORMAL)]];
	float2 uv0		[[attribute(VES_TEXTURE_COORDINATES0)]];
};

struct PS_INPUT
{
	float2 uv0;
	float3 cameraDir;

	float4 gl_Position [[position]];
};

vertex PS_INPUT main_metal
(
	VS_INPUT input [[stage_in]],

	constant float4x4 &worldViewProj [[buffer(PARAMETER_SLOT)]]
)
{
	PS_INPUT outVs;

	outVs.gl_Position	= ( worldViewProj * input.position ).xyzw;
	outVs.uv0			= input.uv0;
	outVs.cameraDir		= input.normal;

	return outVs;
}


fragment shader HgSAO_ps.metal

Code: Select all

#include <metal_stdlib>
using namespace metal;

struct PS_INPUT
{
	float2 uv0;
	float3 cameraDir;
};

fragment float4 main_metal
(
	PS_INPUT inPs [[stage_in]],
	texture2d<float>	depthTexture	[[texture(0)]],
	sampler				samplerState	[[sampler(0)]],

	constant float2 &projectionParams	[[buffer(PARAMETER_SLOT)]]
)
{
    float3 fColor = depthTexture.sample( samplerState, inPs.uv0, level(2)  ).xyz;
    return float4( fColor, 1.0);
}
Comment out mipmap_method compute_hq, or change it to api_default, to get auto mipmaps of BeachStones.jpg.

I wanted it for HgSAO, a nice occlusion technique I had working in an old version of Ogre. I did it using a compositor pass listener, generating mipmaps of the depth G-buffer with render calls in the listener. I've got a video of it: https://vimeo.com/15932418.


An entirely separate issue, which I post for interest.
As I understand it, the Metal depth texture doesn't have mipmaps, but I think I can sort of see how to add them (I don't know if the other render systems have them), with something like:

Code: Select all

    //-----------------------------------------------------------------------------------
    void MetalDepthTexture::_createSurfaceList()
    {
        mSurfaceList.clear();

        __unsafe_unretained id<MTLTexture> renderTexture = mTexture;
        __unsafe_unretained id<MTLTexture> resolveTexture = 0;
        
        if( mMsaaTexture )
        {
            renderTexture   = mMsaaTexture;
            resolveTexture  = mTexture;
        }

        
        for (uint8 face = 0; face < getNumFaces(); face++)
        {
            
            {
                v1::HardwarePixelBuffer *buf = OGRE_NEW v1::MetalDepthPixelBuffer( this, mName,
                                                                                  mWidth, mHeight,
                                                                                  mDepth, mFormat );
                
                mSurfaceList.push_back( v1::HardwarePixelBufferSharedPtr(buf) );
            }
            
            
            for (uint8 mip = 1; mip <= getNumMipmaps(); mip++)
            {
                v1::MetalHardwarePixelBuffer *buf = OGRE_NEW v1::MetalTextureBuffer(renderTexture, resolveTexture, mDevice, mName,
                                                                                    getMetalTextureTarget(), mWidth, mHeight, mDepth, mFormat,
                                                                                    static_cast<int32>(face), mip,
                                                                                    static_cast<v1::HardwareBuffer::Usage>(mUsage),
                                                                                    mHwGamma, mFSAA );
                
                mSurfaceList.push_back( v1::HardwarePixelBufferSharedPtr(buf) );

            }
        }
    }
Somewhere mMipmapsDirty needs to be set to true, and in MetalRenderSystem::_createDepthBufferFor desc.mipmapLevelCount needs to be set to whatever it needs to be.

One could 'perhaps' get a mipmapped depth buffer this way.

If I ever get it working I'll post it.
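
For reference, the loop above pushes one buffer for level 0 and then one per mip, for each face in turn. Assuming that ordering is kept, the position of a given (face, mip) buffer in the surface list can be sketched like this (surfaceIndex and surfaceCount are hypothetical helpers, not Ogre functions):

```cpp
#include <cassert>
#include <cstddef>

// Index in the surface list of the buffer for (face, mip), assuming each face
// contributes level 0 followed by numMipmaps further levels.
inline std::size_t surfaceIndex( std::size_t face, std::size_t mip, std::size_t numMipmaps )
{
    assert( mip <= numMipmaps );
    return face * ( numMipmaps + 1u ) + mip;
}

// Total number of buffers the surface list holds under the same assumption.
inline std::size_t surfaceCount( std::size_t numFaces, std::size_t numMipmaps )
{
    return numFaces * ( numMipmaps + 1u );
}
```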

Re: [2.1] metal compute_hq shader

Post by berserkerviking »

dark_sylinc wrote:And the C++ code that handles this is isolated to:
OgreMain/src/Compositor/Pass/PassMipmap/OgreCompositorPassMipmap.cpp
The C++ treats Metal in a special way (different from D3D11 & GL) whenever it does:

Code: Select all

blurV2->getShaderParams( "metal" );
(note blurV2 is not the only variable doing this) where it pushes the LOD to a constant buffer.
Basically all code related to that would go away (including paramDstLodIdx from what I can see). What is related to blurV2->getShaderParams( "default" ); is common to all three APIs and should stay.
One of the things we haven't yet discussed is the fact that the problem is macOS only. iOS texture.write permits lod to be a variable.
So we want to do our planned uav binding trickery for macOS only. I think the best way to handle this is to insert #if's into MipmapsGaussianBlur_cs.metal code like this:

Code: Select all

@piece( image_store )
	@property( downscale_lq )
		@foreach( 2, iPixel )
#if defined(__ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__)
			outputImage.write( float4( outColour[ @iPixel ], 1.0 ), uint2( i2Center +  @iPixel * i2Inc ),
							   0 );
#else
			outputImage.write( float4( outColour[ @iPixel ], 1.0 ), uint2( i2Center +  @iPixel * i2Inc ),
							   p.dstLodIdx );
#endif
		@end
	@end
	@property( !downscale_lq )
		@foreach( 2, iPixel )
#if defined(__ENVIRONMENT_MAC_OS_X_VERSION_MIN_REQUIRED__)
			outputImage.write( float4( (outColour[ @iPixel * 2 ] + outColour[ @iPixel * 2 + 1 ]) * 0.5, 1.0 ),
							   uint2( i2Center +  @iPixel * i2Inc ),
							   0 );
#else
			outputImage.write( float4( (outColour[ @iPixel * 2 ] + outColour[ @iPixel * 2 + 1 ]) * 0.5, 1.0 ),
							   uint2( i2Center +  @iPixel * i2Inc ),
							   p.dstLodIdx );
#endif
		@end
	@end
@end
(The #if symbol I'm using isn't widely publicized; but I got it straight from the guy in charge of the Metal front-end compiler).

What this means is that we still need to push the lod to the constant buffer in the iOS case, in OgreCompositorPassMipmap.cpp.
However, this code is platform independent and doesn't easily have knowledge of whether it is running on iOS or macOS.

So I propose that we not touch OgreCompositorPassMipmap.cpp and continue to push the lod into the constant buffer for
both iOS and macOS. It is necessary for iOS and it shouldn't hurt anything on macOS; the lod in the constant buffer will
just be ignored.

How does this sound?

Re: [2.1] metal compute_hq shader

Post by dark_sylinc »

For the sake of simplicity and maintenance, I was thinking that iOS should follow macOS's route, unless there's a significant performance difference to justify the two paths. Otherwise both the shader and the C++ code need to be aware of the platform differences: the shader needs to use 0 vs p.dstLodIdx, and the C++ needs to avoid creating a texture view on iOS (otherwise "lod 5" becomes "lod 8", because the lod is relative to the view).

I don't know if you disagree.
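
The relative-lod pitfall can be sketched in C++ like this. It assumes the lod passed by the shader is added to the first level the bound view covers, so the exact numbers depend on how the view was created; effectiveMip, shaderLodFor and checkPathsAgree are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>

// Mip of the parent texture that a write lands on when the shader passes
// 'shaderLod' to a texture whose level 0 is the parent's 'viewBaseMip'.
// Binding the parent texture itself corresponds to viewBaseMip == 0.
inline uint32_t effectiveMip( uint32_t viewBaseMip, uint32_t shaderLod )
{
    return viewBaseMip + shaderLod;
}

// The lod the shader should pass on each path: 0 when a single-mip view is
// bound, the destination mip when the whole texture is bound.
inline uint32_t shaderLodFor( bool usesTextureView, uint32_t dstMip )
{
    return usesTextureView ? 0u : dstMip;
}

// Both consistent paths must end up writing to the requested mip.
inline void checkPathsAgree( uint32_t dstMip )
{
    assert( effectiveMip( 0u, shaderLodFor( false, dstMip ) ) == dstMip );    // whole texture bound
    assert( effectiveMip( dstMip, shaderLodFor( true, dstMip ) ) == dstMip ); // single-mip view bound
}
```

Mixing the two, i.e. binding a view and still passing the destination mip from the shader, is what lands the write on the wrong level.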

As for checking if it's iOS, you can also use:

Code: Select all

@property( iOS )
    iOS code here
@end
@property( !iOS )
    macOS code here
@end

Re: [2.1] metal compute_hq shader

Post by berserkerviking »

Could you please clarify what you are suggesting? To me, the statement "iOS should follow macOS's route" sounds like you are suggesting that we use texture views for iOS. But the rest of what you say sounds like you are suggesting that we still take advantage of variable lod on iOS.

Re: [2.1] metal compute_hq shader

Post by dark_sylinc »

I meant that iOS should also use texture views (because several shader & C++ places would otherwise have to be aware of the differences)

As for the property snippet, it was in case you disagreed with me and wanted to argue in favor of having iOS & macOS follow different paths.

Re: [2.1] metal compute_hq shader

Post by berserkerviking »

I think it is better to use features that are supported by the platform, if possible. Since iOS supports variable lod in the
shader, we should use it. MacOS requires some emulation with the texture views, but we shouldn't force that complication
on the iOS users. It is just a bit of ifdef'ing, which I don't see as a major issue. I use ifdefs all the time.
Is it okay if I proceed with mac-only texture view emulation?

Re: [2.1] metal compute_hq shader

Post by dark_sylinc »

Well, go ahead and make separate paths then.

Though I'd prefer it if you used @property( iOS ) instead of that ifdef: one, for consistency; two, because I am fed up with future API versions suddenly screwing us up (especially when the ifdef isn't documented widely enough to guarantee it's going to remain there).

Re: [2.1] metal compute_hq shader

Post by berserkerviking »

Yes. I definitely plan to use @property.